---
license: mit
tags:
- 3d
pipeline_tag: image-to-3d
---

Real3D

**Model Details**:

**Model Description**: We use the model architecture provided by [TripoSR](https://github.com/VAST-AI-Research/TripoSR), which is a Transformer model for 2D-to-3D mapping built on [LRM](https://arxiv.org/abs/2311.04400). We scale it further on in-the-wild image collections by enabling unsupervised self-training and automatic data curation.

* Developed by: [Hanwen Jiang](https://hwjiang1510.github.io/)
* License: MIT
* Hardware: We train Real3D on 1 node (8 GPUs) with an equivalent batch size of 80 for 5-6 days.

**Model Sources**:

* Paper: https://arxiv.org/abs/2406.08479
* Project: https://hwjiang1510.github.io/Real3D/
* Code for training and evaluation: https://github.com/hwjiang1510/Real3D

**Training Data**: Real3D is jointly trained on synthetic data (Objaverse) and in-the-wild image collections. The former prevents training divergence, while the latter introduces new knowledge from a broader distribution of real images. We use Objaverse renderings from [Zero-1-to-3](https://github.com/cvlab-columbia/zero123) and [GObjaverse](https://aigc3d.github.io/gobjaverse/). The in-the-wild images are from [ImageNet](https://www.image-net.org/), [OpenImages](https://storage.googleapis.com/openimages/web/index.html), etc.

**Misuse, Malicious Use, and Out-of-Scope Use**: The model should not be used to intentionally create or disseminate 3D models that people would foreseeably find disturbing, distressing, or offensive, or content that propagates historical or current stereotypes.