Open3DIS: Open-Vocabulary 3D Instance Segmentation with 2D Mask Guidance

AI-generated keywords: Open3DIS, 3D Instance Segmentation, Object Proposals, Diverse Environments, Performance Improvement

AI-generated Key Points

  • Open3DIS is a method for open-vocabulary 3D instance segmentation in diverse 3D environments
  • The challenge involves accurately identifying objects with varying shapes, sizes, and colors at the instance level
  • Introduces a new module that aggregates 2D instance masks across frames and maps them to geometrically coherent point cloud regions as high-quality object proposals
  • Combining refined proposals with class-agnostic 3D proposals from ISBNet leads to significant performance improvements on datasets like ScanNet200, S3DIS, and Replica
  • Outperforms previous methods like OVIR-3D and OpenMask3D on the ScanNet200 dataset in terms of Average Precision (AP) and APtail metrics
  • Achieves notable enhancement in AP compared to existing approaches by incorporating both 2D and class-agnostic 3D proposals
  • Competes closely with fully supervised techniques on various metrics, demonstrating effectiveness in segmenting rare objects
  • Shows superior performance on other datasets such as ScanNet20 and Replica compared to other state-of-the-art methods across novel and base classes
  • Even in zero-shot scenarios on the Replica dataset without class-agnostic 3D proposals, outperforms competing methods like OpenMask3D and OVIR-3D

Authors: Phuc D. A. Nguyen, Tuan Duc Ngo, Evangelos Kalogerakis, Chuang Gan, Anh Tran, Cuong Pham, Khoi Nguyen

CVPR 2024. Project page: https://open3dis.github.io/
License: CC BY 4.0

Abstract: We introduce Open3DIS, a novel solution designed to tackle the problem of Open-Vocabulary Instance Segmentation within 3D scenes. Objects within 3D environments exhibit diverse shapes, scales, and colors, making precise instance-level identification a challenging task. Recent advancements in Open-Vocabulary scene understanding have made significant strides in this area by employing class-agnostic 3D instance proposal networks for object localization and learning queryable features for each 3D mask. While these methods produce high-quality instance proposals, they struggle with identifying small-scale and geometrically ambiguous objects. The key idea of our method is a new module that aggregates 2D instance masks across frames and maps them to geometrically coherent point cloud regions as high-quality object proposals addressing the above limitations. These are then combined with 3D class-agnostic instance proposals to include a wide range of objects in the real world. To validate our approach, we conducted experiments on three prominent datasets, including ScanNet200, S3DIS, and Replica, demonstrating significant performance gains in segmenting objects with diverse categories over the state-of-the-art approaches.
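
As a rough illustration of mapping 2D instance masks onto a point cloud, the sketch below projects 3D points into a camera frame with a pinhole model, keeps those that land inside a 2D mask, and then counts votes across frames. It is not the paper's module: the function names (`lift_mask_to_points`, `aggregate_frames`), the `min_votes` threshold, and the assumption that the per-frame masks already belong to the same object are all hypothetical, and occlusion is ignored for brevity.

```python
import numpy as np


def lift_mask_to_points(points_world, mask_2d, intrinsics, world_to_cam):
    """Return a boolean array marking the 3D points that project inside mask_2d.

    points_world: (N, 3) float array of points in world coordinates.
    mask_2d:      (H, W) boolean array, one 2D instance mask.
    intrinsics:   (3, 3) pinhole camera matrix.
    world_to_cam: (4, 4) rigid transform from world to camera coordinates.
    Occlusion is NOT handled here (a simplifying assumption of this sketch).
    """
    n = points_world.shape[0]
    # Move the points into the camera frame using homogeneous coordinates.
    homogeneous = np.hstack([points_world, np.ones((n, 1))])
    cam = (world_to_cam @ homogeneous.T).T[:, :3]
    z = cam[:, 2]
    in_front = z > 1e-6
    # Avoid dividing by zero or negative depth for points behind the camera.
    safe_z = np.where(in_front, z, 1.0)
    u = np.round(intrinsics[0, 0] * cam[:, 0] / safe_z + intrinsics[0, 2]).astype(int)
    v = np.round(intrinsics[1, 1] * cam[:, 1] / safe_z + intrinsics[1, 2]).astype(int)
    height, width = mask_2d.shape
    visible = in_front & (u >= 0) & (u < width) & (v >= 0) & (v < height)
    selected = np.zeros(n, dtype=bool)
    # Look up the mask only at pixels that fall inside the image.
    selected[visible] = mask_2d[v[visible], u[visible]]
    return selected


def aggregate_frames(points_world, frames, min_votes=2):
    """Keep points that fall inside the object's mask in at least min_votes frames.

    frames: iterable of (mask_2d, intrinsics, world_to_cam) tuples, assumed to
    be masks of the same object that were associated across frames beforehand.
    """
    votes = np.zeros(points_world.shape[0], dtype=int)
    for mask_2d, intrinsics, world_to_cam in frames:
        votes += lift_mask_to_points(points_world, mask_2d, intrinsics, world_to_cam)
    return votes >= min_votes
```

Voting across frames is one simple way to suppress points that fall inside a mask in only a single view; a real pipeline would also need depth-based visibility checks and a way to associate masks between frames.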

Submitted to arXiv on 17 Dec. 2023

Results of the summarizing process for the arXiv paper: 2312.10671v3

In this study, we present Open3DIS, a solution for open-vocabulary 3D instance segmentation in diverse 3D environments. The challenge lies in accurately identifying objects with varying shapes, sizes, and colors at the instance level. Previous advances in open-vocabulary scene understanding have used class-agnostic 3D instance proposal networks to localize objects and learn queryable features for each 3D mask. While these methods generate high-quality instance proposals, they struggle with small-scale and geometrically ambiguous objects. Our approach introduces a new module that aggregates 2D instance masks across frames and maps them to geometrically coherent point cloud regions as high-quality object proposals. This module overcomes the limitations of existing techniques by providing precise 3D instance masks independently of any pre-existing 3D models. By combining these refined proposals with class-agnostic 3D proposals from ISBNet, our model achieves significant performance improvements on prominent datasets such as ScanNet200, S3DIS, and Replica. Specifically, on the ScanNet200 dataset, Open3DIS outperforms previous methods like OVIR-3D and OpenMask3D by substantial margins in terms of Average Precision (AP) and APtail metrics. By incorporating both 2D and class-agnostic 3D proposals, we achieve a notable improvement in AP over existing approaches. Furthermore, our method competes closely with fully supervised techniques on various metrics, demonstrating its effectiveness in segmenting rare objects. To assess the generalizability of our approach, we evaluated it on different datasets such as ScanNet20 and Replica. In both cases, Open3DIS showed superior performance compared to other state-of-the-art methods across novel and base classes. Even under zero-shot scenarios on the Replica dataset without using class-agnostic 3D proposals, our approach still outperformed competing methods like OpenMask3D and OVIR-3D.
Overall, our study highlights the effectiveness of merging 2D and 3D proposals for improved object segmentation in diverse real-world environments. The results demonstrate the robustness and versatility of Open3DIS in accurately identifying objects with varying characteristics across different datasets.
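
To make the idea of merging 2D-guided and 3D proposals concrete, here is a minimal sketch that pools two lists of instance masks and drops near-duplicates by intersection over union (IoU). The representation of a mask as a set of point indices, the helper names `mask_iou` and `merge_proposals`, and the `iou_threshold` value are assumptions for illustration only; the paper's actual combination step may differ.

```python
def mask_iou(a, b):
    """Intersection over union of two masks given as sets of point indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def merge_proposals(proposals_3d, proposals_2d, iou_threshold=0.9):
    """Pool both proposal sources, skipping 2D-guided masks that duplicate a kept one.

    The threshold is a hypothetical choice, not a value taken from the paper.
    """
    merged = [set(mask) for mask in proposals_3d]
    for candidate in proposals_2d:
        candidate = set(candidate)
        # Keep the candidate only if it is not a near-copy of any kept mask.
        if all(mask_iou(candidate, kept) < iou_threshold for kept in merged):
            merged.append(candidate)
    return merged
```

For example, `merge_proposals([{0, 1, 2, 3}], [{0, 1, 2, 3}, {10, 11}])` keeps the 3D proposal, discards the identical 2D-guided mask, and adds `{10, 11}` as a new, smaller object that the 3D source missed.
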
Created on 10 Apr. 2025
