Monocular 3D Object Detection with LiDAR Guided Semi Supervised Active Learning
AI-generated Key Points
- The paper presents a framework called MonoLiG for monocular 3D object detection with LiDAR guided semi-supervised active learning (SSAL)
- The framework leverages all modalities of collected data during model development and utilizes LiDAR to guide the data selection and training of monocular 3D detectors
- A LiDAR teacher, monocular student cross-modal framework is employed to distill information from unlabeled data as pseudo-labels
- A data noise-based weighting mechanism is proposed to handle differences in sensor characteristics and reduce the effect of propagating noise from LiDAR to monocular
- A sensor consistency-based selection score is proposed for choosing which samples to label; the selection strategy outperforms state-of-the-art active learning baselines, yielding up to a 17% better saving rate in labeling costs
- Experimental results on KITTI and Waymo datasets validate the effectiveness of the proposed framework, consistently outperforming existing active learning baselines
- The training strategy attains the top place in the official KITTI 3D and birds-eye-view (BEV) monocular object detection benchmarks, improving BEV Average Precision (AP) by 2.02
- Related work on active learning for object detection is discussed, specifically pool-based AL selection methods categorized into uncertainty-based and diversity-based approaches
- The authors extend current uncertainty strategies for AL selection by adapting the teacher-student paradigm and adding an inconsistency term, resulting in a better data savings rate than state-of-the-art AL baselines
- Overall, the paper combines LiDAR guidance with semi-supervised active learning for monocular 3D object detection and reports better results than existing active learning baselines on KITTI and Waymo
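The key points describe a selection score that adds an inconsistency term between the LiDAR teacher and the monocular student to an uncertainty score. The summary does not give the formula, so the following is only a minimal sketch under stated assumptions: detections are reduced to BEV box centers, inconsistency is taken as the mean capped distance from each teacher center to its nearest student center, and the names `inconsistency`, `selection_score`, `select_for_labeling`, `lam`, and `max_dist` are all hypothetical rather than taken from the paper.

```python
import math

def inconsistency(teacher_centers, student_centers, max_dist=5.0):
    """Mean capped BEV distance from each teacher center to the nearest student center.

    Assumption: a teacher box with no student box nearby contributes max_dist.
    """
    if not teacher_centers:
        return 0.0
    total = 0.0
    for tx, tz in teacher_centers:
        if student_centers:
            nearest = min(math.hypot(tx - sx, tz - sz) for sx, sz in student_centers)
        else:
            nearest = max_dist
        total += min(nearest, max_dist)
    return total / len(teacher_centers)

def selection_score(uncertainty, teacher_centers, student_centers, lam=1.0, max_dist=5.0):
    """Hypothetical score: student uncertainty plus a weighted inconsistency term."""
    return uncertainty + lam * inconsistency(teacher_centers, student_centers, max_dist)

def select_for_labeling(pool, budget):
    """Return the ids of the `budget` highest-scoring unlabeled samples.

    `pool` maps sample id -> (uncertainty, teacher_centers, student_centers).
    """
    ranked = sorted(pool, key=lambda sid: selection_score(*pool[sid]), reverse=True)
    return ranked[:budget]

# Illustrative pool: values are made up for the example, not results from the paper.
pool = {
    "a": (0.1, [(0.0, 10.0)], [(0.0, 10.0)]),  # teacher and student agree
    "b": (0.1, [(0.0, 10.0)], [(3.0, 14.0)]),  # student box is displaced
    "c": (0.5, [(0.0, 10.0)], []),             # student misses the object
}
chosen = select_for_labeling(pool, budget=2)
```

In this toy pool, the samples where the two sensors disagree rank above the one where they agree, which is the behavior the inconsistency term is meant to encourage.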
Authors: Aral Hekimoglu, Michael Schmidt, Alvaro Marcos-Ramiro
Abstract: We propose a novel semi-supervised active learning (SSAL) framework for monocular 3D object detection with LiDAR guidance (MonoLiG), which leverages all modalities of collected data during model development. We utilize LiDAR to guide the data selection and training of monocular 3D detectors without introducing any overhead in the inference phase. During training, we leverage the LiDAR teacher, monocular student cross-modal framework from semi-supervised learning to distill information from unlabeled data as pseudo-labels. To handle the differences in sensor characteristics, we propose a data noise-based weighting mechanism to reduce the effect of propagating noise from LiDAR modality to monocular. For selecting which samples to label to improve the model performance, we propose a sensor consistency-based selection score that is also coherent with the training objective. Extensive experimental results on KITTI and Waymo datasets verify the effectiveness of our proposed framework. In particular, our selection strategy consistently outperforms state-of-the-art active learning baselines, yielding up to 17% better saving rate in labeling costs. Our training strategy attains the top place in KITTI 3D and birds-eye-view (BEV) monocular object detection official benchmarks by improving the BEV Average Precision (AP) by 2.02.
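The abstract mentions a data noise-based weighting mechanism that limits how much noise propagates from LiDAR pseudo-labels to the monocular student, without stating its form. As a rough illustration only, the sketch below assumes each pseudo-label comes with a scalar noise estimate and down-weights its L1 regression loss with an exponential decay; the weighting function, the `temperature` parameter, and the function names are assumptions for illustration, not the authors' formulation.

```python
import math

def pseudo_label_weight(noise, temperature=1.0):
    """Hypothetical weight in (0, 1]: higher estimated noise gives a lower weight."""
    return math.exp(-noise / temperature)

def weighted_pseudo_label_loss(student_preds, pseudo_labels, noises, temperature=1.0):
    """Weighted mean of per-box L1 losses between student outputs and teacher pseudo-labels.

    Each prediction and pseudo-label is a sequence of box parameters of equal length.
    """
    total = 0.0
    weight_sum = 0.0
    for pred, target, noise in zip(student_preds, pseudo_labels, noises):
        w = pseudo_label_weight(noise, temperature)
        l1 = sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)
        total += w * l1
        weight_sum += w
    return total / weight_sum if weight_sum > 0 else 0.0

# Illustrative values only: two boxes with two parameters each.
preds = [[1.0, 1.0], [2.0, 2.0]]
targets = [[0.0, 0.0], [2.0, 2.0]]
loss_clean = weighted_pseudo_label_loss(preds, targets, noises=[0.0, 0.0])
loss_noisy = weighted_pseudo_label_loss(preds, targets, noises=[3.0, 0.0])
```

When the first pseudo-label is flagged as noisy, its disagreement with the student contributes less, so `loss_noisy` is smaller than `loss_clean`. Normalizing by the weight sum is one design choice among several; it keeps the loss scale comparable across batches with different noise levels.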