Light-Head R-CNN: In Defense of Two-Stage Object Detector

AI-generated keywords: Light-Head R-CNN, Two-Stage Object Detector, YOLO, SSD, COCO

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Authors investigate limitations of two-stage methods compared to single-stage detectors in terms of speed
  • Faster R-CNN and R-FCN involve intensive computations after or before Region of Interest (RoI) warping
  • Heavy-head designs in architecture contribute to slow speed of these networks
  • Authors propose a new two-stage detector called Light-Head R-CNN with a lightweight head design
  • The ResNet-101 based Light-Head R-CNN outperforms state-of-the-art object detectors on the COCO dataset while maintaining time efficiency
  • With the backbone replaced by a small network such as Xception, it achieves 30.7 mmAP at 102 FPS on COCO, surpassing YOLO and SSD in both speed and accuracy
  • Authors plan to make their code publicly available for further exploration and implementation

Authors: Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, Jian Sun

Abstract: In this paper, we first investigate why typical two-stage methods are not as fast as single-stage, fast detectors like YOLO and SSD. We find that Faster R-CNN and R-FCN perform an intensive computation after or before RoI warping. Faster R-CNN involves two fully connected layers for RoI recognition, while R-FCN produces large score maps. Thus, the speed of these networks is slow due to the heavy-head design in the architecture. Even if we significantly reduce the base model, the computation cost cannot be largely decreased accordingly. We propose a new two-stage detector, Light-Head R-CNN, to address the shortcoming in current two-stage approaches. In our design, we make the head of the network as light as possible, by using a thin feature map and a cheap R-CNN subnet (pooling and a single fully-connected layer). Our ResNet-101 based Light-Head R-CNN outperforms state-of-the-art object detectors on COCO while keeping time efficiency. More importantly, simply replacing the backbone with a tiny network (e.g., Xception), our Light-Head R-CNN gets 30.7 mmAP at 102 FPS on COCO, significantly outperforming the single-stage, fast detectors like YOLO and SSD on both speed and accuracy. Code will be made publicly available.
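To make the "large score maps" versus "thin feature map" contrast concrete, the following is a rough back-of-the-envelope sketch rather than the authors' code. The abstract gives no layer sizes, so every constant here is an assumption chosen for illustration: 81 classes (80 COCO categories plus background), a 7 x 7 pooling grid, 10 channels per bin for the thin map, and a 2048-unit fully-connected layer.

```python
def score_map_channels(per_bin_channels: int, pooled_size: int) -> int:
    """Channels needed so that each of the pooled_size x pooled_size
    spatial bins owns its own group of per_bin_channels channels."""
    return per_bin_channels * pooled_size * pooled_size


# All four constants are illustrative assumptions, not values from the abstract.
NUM_CLASSES = 81   # assumed: 80 COCO categories + background
POOLED_SIZE = 7    # assumed pooling grid (7 x 7 bins per RoI)
THIN_ALPHA = 10    # assumed channels per bin in the thin feature map
FC_WIDTH = 2048    # assumed width of the single fully-connected layer

# Heavy head in the R-FCN style: one channel group per class and per bin.
heavy_channels = score_map_channels(NUM_CLASSES, POOLED_SIZE)

# Light head: the per-bin channel count no longer depends on the class count.
thin_channels = score_map_channels(THIN_ALPHA, POOLED_SIZE)

# Weights in the single fully-connected layer applied to the pooled RoI feature.
fc_weights = thin_channels * FC_WIDTH

print(f"heavy score map channels: {heavy_channels}")
print(f"thin feature map channels: {thin_channels}")
print(f"channel reduction: {heavy_channels / thin_channels:.1f}x")
print(f"single FC layer weights: {fc_weights}")
```

Under these assumptions the map produced before RoI warping shrinks by roughly a factor of eight, and its size stops growing with the number of classes, which is the sense in which the head is "light".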

Submitted to arXiv on 20 Nov. 2017


Results of the summarizing process for the arXiv paper: 1711.07264v2

This paper's license doesn't allow us to build upon its content and the summarizing process is here made with the paper's metadata rather than the article.

In the paper "Light-Head R-CNN: In Defense of Two-Stage Object Detector," authors Zeming Li, Chao Peng, Gang Yu, Xiangyu Zhang, Yangdong Deng, and Jian Sun investigate why typical two-stage methods are slower than single-stage detectors such as YOLO and SSD. They find that Faster R-CNN and R-FCN perform intensive computation after or before Region of Interest (RoI) warping: Faster R-CNN uses two fully connected layers for RoI recognition, while R-FCN produces large score maps. These heavy-head designs make the networks slow, and even a significantly smaller base model does not reduce the computation cost proportionally.

To address this shortcoming, the authors propose a new two-stage detector called Light-Head R-CNN. The key idea is to make the head of the network as lightweight as possible by using a thin feature map and a cheap R-CNN subnet consisting of pooling and a single fully connected layer.

Their ResNet-101 based Light-Head R-CNN outperforms state-of-the-art object detectors on the COCO dataset while maintaining time efficiency. With the backbone replaced by a smaller network such as Xception, Light-Head R-CNN achieves 30.7 mmAP at 102 FPS on COCO, significantly surpassing single-stage fast detectors like YOLO and SSD in both speed and accuracy. The authors plan to make their code publicly available. Overall, the work shows how the speed of two-stage object detection can be improved while keeping competitive accuracy on benchmark datasets like COCO.
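The "pooling and a single fully connected layer" subnet can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `psroi_pool`, the channel layout (bin (i, j) owns channel group i * p + j), the use of position-sensitive average pooling, the random weights, and the sizes (10 channels per bin, a 7 x 7 grid, a 2048-unit layer) are all hypothetical choices that the summary does not specify.

```python
import numpy as np


def psroi_pool(feat, roi, p, alpha):
    """Position-sensitive RoI average pooling (illustrative sketch).

    feat:  array of shape (alpha * p * p, H, W); channel group (i * p + j)
           is assumed to hold the alpha channels dedicated to bin (i, j).
    roi:   (x0, y0, x1, y1) in feature-map coordinates, end-exclusive,
           assumed to lie inside the feature map with x1 > x0 and y1 > y0.
    Returns an array of shape (alpha, p, p).
    """
    x0, y0, x1, y1 = roi
    ys = np.linspace(y0, y1, p + 1)  # bin edges along the height
    xs = np.linspace(x0, x1, p + 1)  # bin edges along the width
    out = np.zeros((alpha, p, p), dtype=feat.dtype)
    for i in range(p):
        ya = int(np.floor(ys[i]))
        yb = max(int(np.ceil(ys[i + 1])), ya + 1)  # keep at least one row
        for j in range(p):
            xa = int(np.floor(xs[j]))
            xb = max(int(np.ceil(xs[j + 1])), xa + 1)  # at least one column
            g = i * p + j  # channel group that belongs to this bin
            group = feat[g * alpha:(g + 1) * alpha, ya:yb, xa:xb]
            out[:, i, j] = group.mean(axis=(1, 2))
    return out


# Hypothetical sizes, chosen only for illustration.
P, ALPHA, FC_WIDTH = 7, 10, 2048
rng = np.random.default_rng(0)

thin_map = rng.standard_normal((ALPHA * P * P, 40, 60))  # thin feature map
pooled = psroi_pool(thin_map, roi=(5, 8, 33, 29), p=P, alpha=ALPHA)

# Single fully-connected layer with ReLU on the flattened RoI feature.
fc_weight = rng.standard_normal((FC_WIDTH, ALPHA * P * P)) * 0.01
fc_bias = np.zeros(FC_WIDTH)
roi_feature = np.maximum(fc_weight @ pooled.ravel() + fc_bias, 0.0)

print(pooled.shape, roi_feature.shape)
```

In a full detector, `roi_feature` would feed the classification and box-regression outputs. Because each RoI only touches a 490-value vector and one matrix multiply in this sketch, the per-RoI cost stays small even with many proposals.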
Created on 01 Jan. 2024

