RTMDet: An Empirical Study of Designing Real-Time Object Detectors
AI-generated Key Points
- RTMDet is an efficient real-time object detector that surpasses the YOLO series.
- It is easily adaptable for various object recognition tasks such as instance segmentation and rotated object detection.
- To make the architecture more efficient, the authors explore a design in which the backbone and neck have compatible capacities.
- Large-kernel depth-wise convolutions are used as a basic building block to achieve efficiency.
- Soft labels are introduced when calculating matching costs in the dynamic label assignment, which improves accuracy.
- Better training techniques are employed to enhance performance.
- RTMDet achieves 52.8% average precision (AP) on COCO while running at more than 300 frames per second (FPS) on an NVIDIA 3090 GPU.
- It outperforms current mainstream industrial detectors in terms of accuracy and efficiency.
- RTMDet comes in tiny, small, medium, large, and extra-large sizes, and the authors report the best parameter-accuracy trade-off across them for various application scenarios.
- It achieves state-of-the-art performance in real-time instance segmentation and rotated object detection.
- The code and models for RTMDet are publicly available on GitHub.
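
The efficiency argument behind the large-kernel depth-wise convolutions mentioned above can be checked with simple weight counting. The sketch below is illustrative only: the function names, the channel width of 256, and the kernel sizes 3, 5, and 7 are assumptions chosen for the example and are not taken from the paper. It compares a dense k x k convolution with a depth-wise k x k convolution followed by a 1 x 1 point-wise convolution.

```python
def conv_params(in_channels: int, out_channels: int, kernel_size: int, groups: int = 1) -> int:
    """Number of weights in a 2D convolution (bias terms ignored)."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel sees only in_channels // groups input channels.
    return (in_channels // groups) * out_channels * kernel_size * kernel_size


def depthwise_separable_params(channels: int, kernel_size: int) -> int:
    """Weights in a depth-wise k x k conv followed by a 1 x 1 point-wise conv."""
    depthwise = conv_params(channels, channels, kernel_size, groups=channels)
    pointwise = conv_params(channels, channels, 1)
    return depthwise + pointwise


if __name__ == "__main__":
    channels = 256  # hypothetical layer width, for illustration only
    for k in (3, 5, 7):  # hypothetical kernel sizes
        dense = conv_params(channels, channels, k)
        separable = depthwise_separable_params(channels, k)
        print(f"k={k}: dense={dense:,}  depth-wise+point-wise={separable:,}  "
              f"ratio={dense / separable:.1f}x")
```

Because the depth-wise term grows only with `channels * k * k`, enlarging the kernel adds few weights compared with a dense convolution, where the same change is multiplied by the full channel product.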
Authors: Chengqi Lyu, Wenwei Zhang, Haian Huang, Yue Zhou, Yudong Wang, Yanyi Liu, Shilong Zhang, Kai Chen
Abstract: In this paper, we aim to design an efficient real-time object detector that exceeds the YOLO series and is easily extensible for many object recognition tasks such as instance segmentation and rotated object detection. To obtain a more efficient model architecture, we explore an architecture that has compatible capacities in the backbone and neck, constructed by a basic building block that consists of large-kernel depth-wise convolutions. We further introduce soft labels when calculating matching costs in the dynamic label assignment to improve accuracy. Together with better training techniques, the resulting object detector, named RTMDet, achieves 52.8% AP on COCO with 300+ FPS on an NVIDIA 3090 GPU, outperforming the current mainstream industrial detectors. RTMDet achieves the best parameter-accuracy trade-off with tiny/small/medium/large/extra-large model sizes for various application scenarios, and obtains new state-of-the-art performance on real-time instance segmentation and rotated object detection. We hope the experimental results can provide new insights into designing versatile real-time object detectors for many object recognition tasks. Code and models are released at https://github.com/open-mmlab/mmdetection/tree/3.x/configs/rtmdet.
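
To make the idea of soft labels in the matching cost concrete, here is a minimal sketch. The abstract does not state the cost formula, so the form used here is an assumption for illustration: the predicted score is compared against the IoU between the predicted and ground-truth boxes (the soft label) with binary cross-entropy, scaled by the squared gap between the two. The function names and the example numbers are hypothetical.

```python
import math


def _clamp(p: float, eps: float = 1e-12) -> float:
    """Keep a probability strictly inside (0, 1) so log() stays finite."""
    return min(max(p, eps), 1.0 - eps)


def hard_label_cls_cost(score: float) -> float:
    """Classification cost against a hard target of 1 (IoU is ignored)."""
    return -math.log(_clamp(score))


def soft_label_cls_cost(score: float, iou: float) -> float:
    """Assumed soft-label cost: BCE against the IoU, scaled by the squared gap."""
    if not (0.0 <= score <= 1.0 and 0.0 <= iou <= 1.0):
        raise ValueError("score and iou must lie in [0, 1]")
    p = _clamp(score)
    bce = -(iou * math.log(p) + (1.0 - iou) * math.log(1.0 - p))
    return bce * (iou - score) ** 2


if __name__ == "__main__":
    # Two hypothetical predictions with the same score but different box quality.
    for score, iou in ((0.9, 0.9), (0.9, 0.5)):
        print(f"score={score} iou={iou} "
              f"hard={hard_label_cls_cost(score):.4f} "
              f"soft={soft_label_cls_cost(score, iou):.4f}")
```

With a hard label, both predictions receive the same cost because only the score matters. With the soft label, the prediction whose score agrees with its localization quality is cheaper to match, which is the kind of distinction a soft-label cost is meant to provide.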