---
license: apache-2.0
pipeline_tag: image-to-3d
library_name: 3dtopia-xl
tags:
- text-to-3d
- image-to-3d
---
# 3DTopia-XL
This repo contains the pretrained weights for *3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion*.
[Project Page](https://3dtopia.github.io/3DTopia-XL/) | [arXiv](https://arxiv.org/abs/2409.12957) | [Weights](https://huggingface.co/FrozenBurning/3DTopia-XL) | [Code](https://github.com/3DTopia/3DTopia-XL)
## Introduction
3DTopia-XL scales high-quality 3D asset generation using a Diffusion Transformer (DiT) built upon **PrimX**, an expressive and efficient 3D representation. The denoising process takes 5 seconds to generate a 3D PBR asset from text or image input, and the result is ready for use in a graphics pipeline.
<video controls autoplay src="https://cdn-uploads.huggingface.co/production/uploads/62fc8cf7ee999004b5a8b982/-f349zLT7hjWla9yxupSY.mp4"></video>
## Model Details
The model is trained on a ~256K subset of [Objaverse](https://huggingface.co/datasets/allenai/objaverse).
For more details, please refer to our paper.
## Usage
To download the model:
```python
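from huggingface_hub import hf_hub_download

# hf_hub_download returns the local path of the cached file.
# Diffusion Transformer (DiT) checkpoint, fp16
ckpt_path = hf_hub_download(repo_id="frozenburning/3DTopia-XL", filename="model_sview_dit_fp16.pt")
# VAE checkpoint, fp16
vae_ckpt_path = hf_hub_download(repo_id="frozenburning/3DTopia-XL", filename="model_vae_fp16.pt")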
```
Please refer to our [repo](https://github.com/3DTopia/3DTopia-XL) for more details on loading and inference.
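The download step above can be wrapped in a small helper that fetches both checkpoints in one call. This is a minimal sketch, not part of the official code: the names `fetch_checkpoints`, `REPO_ID`, and `CHECKPOINT_FILES` and the `"dit"`/`"vae"` keys are hypothetical, and only the repo ID and filenames come from the snippet above. The downloader is passed in as a parameter so the helper can be exercised without network access.

```python
from typing import Callable, Dict, Optional

# Repo ID and filenames are taken from the download snippet in this README.
REPO_ID = "frozenburning/3DTopia-XL"
CHECKPOINT_FILES = {
    "dit": "model_sview_dit_fp16.pt",  # hypothetical key for the DiT weights
    "vae": "model_vae_fp16.pt",        # hypothetical key for the VAE weights
}


def fetch_checkpoints(download: Optional[Callable[..., str]] = None) -> Dict[str, str]:
    """Return a mapping from checkpoint name to local file path.

    `download` must accept `repo_id` and `filename` keyword arguments and
    return a local path. It defaults to `huggingface_hub.hf_hub_download`,
    imported lazily so the helper can be tested with a stub.
    """
    if download is None:
        from huggingface_hub import hf_hub_download
        download = hf_hub_download
    return {
        name: download(repo_id=REPO_ID, filename=filename)
        for name, filename in CHECKPOINT_FILES.items()
    }
```

Calling `fetch_checkpoints()` with no arguments downloads (or reuses from the local cache) both files and returns their paths, which can then be passed to the loading code in the repo.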
## Citation
```bibtex
@article{chen2024primx,
title={3DTopia-XL: High-Quality 3D PBR Asset Generation via Primitive Diffusion},
author={Chen, Zhaoxi and Tang, Jiaxiang and Dong, Yuhao and Cao, Ziang and Hong, Fangzhou and Lan, Yushi and Wang, Tengfei and Xie, Haozhe and Wu, Tong and Saito, Shunsuke and Pan, Liang and Lin, Dahua and Liu, Ziwei},
journal={arXiv preprint arXiv:2409.12957},
year={2024}
}
```