PartGen: Part-level 3D Generation and Reconstruction with Multi-View Diffusion Models
Abstract
Text- or image-to-3D generators and 3D scanners can now produce 3D assets with high-quality shapes and textures. These assets typically consist of a single, fused representation, like an implicit neural field, a Gaussian mixture, or a mesh, without any useful structure. However, most applications and creative workflows require assets to be made of several meaningful parts that can be manipulated independently. To address this gap, we introduce PartGen, a novel approach that generates 3D objects composed of meaningful parts starting from text, an image, or an unstructured 3D object. First, given multiple views of a 3D object, generated or rendered, a multi-view diffusion model extracts a set of plausible and view-consistent part segmentations, dividing the object into parts. Then, a second multi-view diffusion model takes each part separately, fills in the occlusions, and feeds the completed views to a 3D reconstruction network. This completion process considers the context of the entire object so that the parts integrate cohesively. The generative completion model can make up for information missing due to occlusions; in extreme cases, it can hallucinate entirely invisible parts based on the input 3D asset. We evaluate our method on generated and real 3D assets and show that it outperforms segmentation and part-extraction baselines by a large margin. We also showcase downstream applications such as 3D part editing.
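The abstract describes a two-stage pipeline: a multi-view diffusion model segments the object into parts, then a second multi-view diffusion model completes each part's occluded regions before a reconstruction network lifts it to 3D. The sketch below illustrates that control flow only; every class and function name (render_views, segment_parts, complete_part, reconstruct, Part3D) is a hypothetical placeholder standing in for the paper's models, not a released API.

```python
# Minimal, hypothetical sketch of the PartGen pipeline described above.
# The functions below are stand-ins for the paper's diffusion models and
# reconstruction network; they only mimic the data flow.

from dataclasses import dataclass
from typing import List

import numpy as np


@dataclass
class Part3D:
    name: str
    geometry: np.ndarray  # placeholder for a reconstructed part (e.g. points)


def render_views(asset, n_views: int = 4) -> List[np.ndarray]:
    """Render (or reuse generated) multi-view images of the input asset."""
    return [np.zeros((256, 256, 3), dtype=np.float32) for _ in range(n_views)]


def segment_parts(views: List[np.ndarray]) -> List[List[np.ndarray]]:
    """Stage 1 stand-in: a multi-view diffusion model would predict
    view-consistent part masks; here we return a single dummy mask set."""
    masks = [np.ones(v.shape[:2], dtype=bool) for v in views]
    return [masks]  # one "part", with a mask per view


def complete_part(views, masks) -> List[np.ndarray]:
    """Stage 2 stand-in: a second multi-view diffusion model would inpaint
    the part's occluded regions, conditioned on the whole object."""
    return [np.where(m[..., None], v, 0.0) for v, m in zip(views, masks)]


def reconstruct(completed_views: List[np.ndarray]) -> Part3D:
    """Stand-in for the 3D reconstruction network applied to completed views."""
    return Part3D(name="part", geometry=np.stack(completed_views))


def partgen_pipeline(asset) -> List[Part3D]:
    views = render_views(asset)                  # generated or rendered views
    parts = []
    for masks in segment_parts(views):           # per-part, view-consistent masks
        completed = complete_part(views, masks)  # fill occlusions in context
        parts.append(reconstruct(completed))     # lift each part to 3D
    return parts


if __name__ == "__main__":
    print(len(partgen_pipeline(asset=None)), "part(s) reconstructed")
```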
Community
Automated message from Librarian Bot: the following similar papers were recommended by the Semantic Scholar API.
- MVBoost: Boost 3D Reconstruction with Multi-View Refinement (2024)
- Sharp-It: A Multi-view to Multi-view Diffusion Model for 3D Synthesis and Manipulation (2024)
- 3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement (2024)
- Edify 3D: Scalable High-Quality 3D Asset Generation (2024)
- Style3D: Attention-guided Multi-view Style Transfer for 3D Object Generation (2024)
- Direct and Explicit 3D Generation from a Single Image (2024)
- Gen-3Diffusion: Realistic Image-to-3D Generation via 2D&3D Diffusion Synergy (2024)