DiffuScene: Scene Graph Denoising Diffusion Probabilistic Model for Generative Indoor Scene Synthesis
AI-generated Key Points
- Traditional methods in indoor 3D scene synthesis rely on optimization techniques with predefined constraints based on room design rules, object category distributions, and human-object interactions.
- Recent advancements in deep generative models show promise in learning scene priors from large-scale datasets.
- GAN-based methods are used for implicit fitting of scene distributions through adversarial training, while VAE-based methods explicitly approximate scene distributions for better generative diversity.
- Auto-regressive models have been explored to predict objects sequentially, but they may struggle to capture relative attributes between objects and tend to accumulate prediction errors along the sequence.
- The paper introduces a scene graph denoising diffusion probabilistic model that generates 3D instance properties stored in a fully-connected scene graph, then retrieves the most similar object geometry for each graph node. Each node (an object instance) is a concatenation of location, size, orientation, semantic, and geometry features.
- By utilizing a diffusion model to determine placements and types of 3D instances, the model enables diverse applications including scene completion, arrangement, and text-conditioned synthesis.
- Experimental results on the 3D-FRONT dataset demonstrate that the model outperforms state-of-the-art methods by synthesizing more physically plausible and diverse indoor scenes.
- Extensive ablation studies further validate the effectiveness of the design choices made in the development of the scene diffusion model.
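As a rough illustration of the node representation described in the key points, the sketch below builds one attribute vector per object and lists the edges of a fully-connected graph. The category list, the (cos, sin) orientation encoding, the two-dimensional shape code, and the names `ObjectNode` and `fully_connected_edges` are all assumptions made for this example; the summary does not specify the actual encoding or dimensions.

```python
import math
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple

# Hypothetical category list; the real label set depends on the dataset.
CATEGORIES = ["bed", "nightstand", "wardrobe", "chair"]


@dataclass
class ObjectNode:
    location: Tuple[float, float, float]  # centre of the object
    size: Tuple[float, float, float]      # bounding-box extents
    angle: float                          # rotation about the vertical axis (radians)
    category: str                         # semantic label
    shape_code: List[float]               # assumed latent geometry feature

    def to_vector(self) -> List[float]:
        """Concatenate all attributes into one flat feature vector."""
        one_hot = [1.0 if c == self.category else 0.0 for c in CATEGORIES]
        # Assumption: orientation is encoded as (cos, sin) to avoid the 2*pi wrap.
        orientation = [math.cos(self.angle), math.sin(self.angle)]
        return [*self.location, *self.size, *orientation, *one_hot, *self.shape_code]


def fully_connected_edges(n: int) -> List[Tuple[int, int]]:
    """Every unordered pair of nodes is connected."""
    return list(combinations(range(n), 2))


scene = [
    ObjectNode((0.0, 0.0, 0.0), (2.0, 0.5, 1.6), 0.0, "bed", [0.1, 0.2]),
    ObjectNode((1.4, 0.0, 0.0), (0.5, 0.5, 0.4), 0.0, "nightstand", [0.3, 0.4]),
    ObjectNode((0.0, 0.0, 2.5), (1.8, 2.0, 0.6), math.pi, "wardrobe", [0.5, 0.6]),
]
vectors = [node.to_vector() for node in scene]
edges = fully_connected_edges(len(scene))
```

Here each vector has 3 + 3 + 2 + 4 + 2 = 14 entries, and the three objects yield three edges. A diffusion model over such a scene would operate on the stacked vectors rather than on one object at a time.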
Authors: Jiapeng Tang, Yinyu Nie, Lev Markhasin, Angela Dai, Justus Thies, Matthias Nießner
Abstract: We present DiffuScene for indoor 3D scene synthesis based on a novel scene graph denoising diffusion probabilistic model, which generates 3D instance properties stored in a fully-connected scene graph and then retrieves the most similar object geometry for each graph node, i.e., an object instance characterized as a concatenation of different attributes, including location, size, orientation, semantic, and geometry features. Based on this scene graph, we designed a diffusion model to determine the placements and types of 3D instances. Our method can facilitate many downstream applications, including scene completion, scene arrangement, and text-conditioned scene synthesis. Experiments on the 3D-FRONT dataset show that our method can synthesize more physically plausible and diverse indoor scenes than state-of-the-art methods. Extensive ablation studies verify the effectiveness of our design choices in scene diffusion models.
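To make the two stages in the abstract more concrete, the sketch below shows the standard DDPM forward-noising step, x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε, applied to an attribute vector, followed by a nearest-neighbour lookup of object geometry. The linear beta schedule, the Euclidean distance on a hypothetical shape code, the restriction to the same category, and the function names `alpha_bars`, `add_noise`, and `retrieve` are assumptions for illustration only; the abstract does not state how the paper configures either stage.

```python
import math
import random
from typing import Dict, List, Sequence


def alpha_bars(num_steps: int, beta_start: float = 1e-4, beta_end: float = 0.02) -> List[float]:
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    out, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        out.append(prod)
    return out


def add_noise(x0: Sequence[float], alpha_bar: float, rng: random.Random) -> List[float]:
    """Sample x_t from q(x_t | x_0) for a single noise level."""
    signal, noise = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [signal * v + noise * rng.gauss(0.0, 1.0) for v in x0]


def retrieve(shape_code: Sequence[float], category: str, library: List[Dict]) -> str:
    """Return the name of the closest library model of the same category.

    Raises ValueError if the library has no model of that category.
    """
    candidates = [m for m in library if m["category"] == category]
    return min(candidates, key=lambda m: math.dist(m["shape_code"], shape_code))["name"]


rng = random.Random(0)
schedule = alpha_bars(1000)
x0 = [0.5, -0.2, 1.0]                        # toy attribute vector
x_early = add_noise(x0, schedule[0], rng)    # almost unchanged
x_late = add_noise(x0, schedule[-1], rng)    # close to pure Gaussian noise

# Hypothetical model library with made-up shape codes.
library = [
    {"name": "chair_a", "category": "chair", "shape_code": [0.0, 0.0]},
    {"name": "chair_b", "category": "chair", "shape_code": [1.0, 1.0]},
    {"name": "bed_a", "category": "bed", "shape_code": [0.9, 0.9]},
]
match = retrieve([0.8, 0.9], "chair", library)
```

In this toy run `match` is `chair_b`, the nearest chair to the query code. A trained denoising network, omitted here, would reverse the noising process to produce the attribute vectors that feed the retrieval step.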