Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data
Updates
- 2025-04-01: Presentation slides are now available for download.
- 2025-03-27: The paper is now available on arXiv.
- 2025-03-03: Gradio and HuggingFace Demos are available.
- 2025-02-27: TriplaneTurbo is accepted to CVPR 2025.
Features
- Fast Inference: outputs a textured mesh from a text prompt in around 1 second.
- Text Comprehension: demonstrates strong understanding of complex text prompts, ensuring generation that accurately follows the input.
- 3D-Data-Free Training: the entire training process relies on no 3D datasets, making it more resource-friendly and adaptable.
Start local inference in 3 minutes
If you only want to run the demo locally, use the following commands. For training and evaluation, follow the environment setup in the next section instead.
python -m venv venv
source venv/bin/activate
bash setup.sh
python gradio_app.py
Official Installation
Create a virtual environment:
conda create -n triplaneturbo python=3.10
conda activate triplaneturbo
conda install pytorch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 pytorch-cuda=12.1 -c pytorch -c nvidia
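Before moving on, it may be worth verifying that the install actually picked up CUDA (a minimal check, nothing repo-specific):
# verify the PyTorch/CUDA install before building CUDA extensions
import torch

print(torch.__version__)          # expect 2.2.0
print(torch.version.cuda)         # expect 12.1
print(torch.cuda.is_available())  # should print True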
(Optional, Recommended) Install xFormers for attention acceleration:
conda install xformers -c xformers
(Optional, Recommended) Install ninja to speed up the compilation of CUDA extensions
pip install ninja
Install major dependencies
pip install -r requirements.txt
Install iNGP
export PATH="/usr/local/cuda/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda/lib64:$LD_LIBRARY_PATH"
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
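If the build succeeds, the bindings should import cleanly. A minimal smoke test for the compiled extension (the module name tinycudann comes from the tiny-cuda-nn torch bindings; the hash-grid config below is illustrative):
# smoke test for the tiny-cuda-nn torch bindings
import torch
import tinycudann as tcnn

# a small iNGP-style hash-grid encoding; config values are illustrative
encoding = tcnn.Encoding(
    n_input_dims=3,
    encoding_config={
        "otype": "HashGrid", "n_levels": 8, "n_features_per_level": 2,
        "log2_hashmap_size": 15, "base_resolution": 16, "per_level_scale": 1.5,
    },
)
x = torch.rand(4, 3, device="cuda")
print(encoding(x).shape)  # (4, 16): 8 levels x 2 features per level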
If you encounter errors while installing iNGP, check your gcc version first. Follow these steps to change the gcc version within your conda environment; afterwards, return to the project directory and reinstall iNGP and NerfAcc:
conda install -c conda-forge gxx=9.5.0
cd $CONDA_PREFIX/lib
ln -s /usr/lib/x86_64-linux-gnu/libcuda.so ./
cd <your project directory>
Evaluation
If you only want to run the evaluation without training, follow these steps:
# Download the model from HuggingFace
huggingface-cli download --resume-download ZhiyuanthePony/TriplaneTurbo \
--include "triplane_turbo_sd_v1.pth" \
--local-dir ./pretrained \
--local-dir-use-symlinks False
# Download evaluation assets
python scripts/prepare/download_eval_only.py
# Run evaluation script
bash scripts/eval/dreamfusion.sh --gpu 0,1 # You can use more GPUs (e.g. 0,1,2,3,4,5,6,7); for single-GPU usage, check the script for required modifications
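If you prefer Python over the CLI, the checkpoint download above can also be done with huggingface_hub (an equivalent call, not an extra step):
# Python equivalent of the huggingface-cli download above
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="ZhiyuanthePony/TriplaneTurbo",
    filename="triplane_turbo_sd_v1.pth",
    local_dir="./pretrained",
)
print(ckpt_path)  # ./pretrained/triplane_turbo_sd_v1.pth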
Our evaluation metrics include:
- CLIP Similarity Score
- CLIP Recall@1
For detailed evaluation results, please refer to our paper.
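For intuition, the CLIP Similarity Score measures how well a rendered view matches its prompt in CLIP embedding space. A minimal sketch using the openai/clip-vit-base-patch32 checkpoint (illustrative only; evaluation/clipscore/compute.py may use a different CLIP variant and averaging scheme, and render.png is a hypothetical rendered view):
# illustrative CLIP similarity between one rendered view and its prompt
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("render.png")  # hypothetical rendered view
inputs = processor(text=["a DSLR photo of a corgi"], images=image,
                   return_tensors="pt", padding=True)
with torch.no_grad():
    out = model(**inputs)

# cosine similarity of L2-normalized image and text embeddings
img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
print((img * txt).sum(dim=-1).item())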
If you want to evaluate your own model, use the following script:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python launch.py \
--config <path_to_your_exp_config> \
--export \
system.exporter_type="multiprompt-mesh-exporter" \
resume=<path_to_your_ckpt> \
data.prompt_library="dreamfusion_415_prompt_library" \
system.exporter.fmt=obj
After running the script, you will find the generated OBJ files in outputs/<your_exp>/dreamfusion_415_prompt_library/save/<itXXXXX-export>. Set this path as <OBJ_DIR>, and set outputs/<your_exp>/dreamfusion_415_prompt_library/save/<itXXXXX-4views> as <VIEW_DIR>. Then run:
SAVE_DIR=<VIEW_DIR>
python evaluation/mesh_visualize.py \
<OBJ_DIR> \
--save_dir $SAVE_DIR \
--gpu 0,1,2,3,4,5,6,7
python evaluation/clipscore/compute.py \
--result_dir $SAVE_DIR
The evaluation results will be displayed in your terminal once the computation is complete.
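If a score looks off, it can help to sanity-check the exported OBJ files first, e.g. with trimesh (an illustrative snippet, not part of the repo's scripts; the file path is hypothetical):
# quick sanity check of an exported mesh (path is hypothetical)
import trimesh

mesh = trimesh.load("outputs/my_exp/save/mesh.obj", force="mesh")
print(mesh.vertices.shape, mesh.faces.shape)  # expect non-empty geometry
print(mesh.is_watertight)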
Training Options
1. Download Required Pretrained Models and Datasets
Use the provided download script to get all necessary files:
python scripts/prepare/download_full.py
This will download:
- Stable Diffusion 2.1 Base
- Stable Diffusion 1.5
- MVDream 4-view checkpoint
- RichDreamer checkpoint
- Text prompt datasets (3DTopia and DALLE+Midjourney)
2. Training Options
Option 1: Train with 3DTopia Text Prompts
# Single GPU
CUDA_VISIBLE_DEVICES=0 python launch.py \
--config configs/TriplaneTurbo_v0_acc-2.yaml \
--train \
data.prompt_library="3DTopia_prompt_library" \
data.condition_processor.cache_dir=".threestudio_cache/text_embeddings_3DTopia" \
data.guidance_processor.cache_dir=".threestudio_cache/text_embeddings_3DTopia"
For multi-GPU training:
# 8 GPUs with 48GB+ memory each
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python launch.py \
--config configs/TriplaneTurbo_v1_acc-2.yaml \
--train \
data.prompt_library="3DTopia_361k_prompt_library" \
data.condition_processor.cache_dir=".threestudio_cache/text_embeddings_3DTopia" \
data.guidance_processor.cache_dir=".threestudio_cache/text_embeddings_3DTopia"
Option 2: Train with DALLE+Midjourney Text Prompts
Choose the appropriate command based on your GPU configuration:
# Single GPU
CUDA_VISIBLE_DEVICES=0 python launch.py \
--config configs/TriplaneTurbo_v0_acc-2.yaml \
--train \
data.prompt_library="DALLE_Midjourney_prompt_library" \
data.condition_processor.cache_dir=".threestudio_cache/text_embeddings_DE+MJ" \
data.guidance_processor.cache_dir=".threestudio_cache/text_embeddings_DE+MJ"
For multi-GPU training (higher performance):
# 8 GPUs with 48GB+ memory each
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python launch.py \
--config configs/TriplaneTurbo_v1_acc-2.yaml \
--train \
data.prompt_library="DALLE_Midjourney_prompt_library" \
data.condition_processor.cache_dir=".threestudio_cache/text_embeddings_DE+MJ" \
data.guidance_processor.cache_dir=".threestudio_cache/text_embeddings_DE+MJ"
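The cache_dir options above store precomputed text embeddings so prompts are not re-encoded on every run. A minimal sketch of that caching idea, assuming the Stable Diffusion 2.1 Base text encoder (the repo's condition and guidance processors are more involved; the helper below is hypothetical):
# sketch of prompt-embedding caching, as the cache_dir options suggest
import hashlib, os
import torch
from transformers import CLIPTextModel, CLIPTokenizer

repo = "stabilityai/stable-diffusion-2-1-base"
tokenizer = CLIPTokenizer.from_pretrained(repo, subfolder="tokenizer")
encoder = CLIPTextModel.from_pretrained(repo, subfolder="text_encoder")

def cached_embedding(prompt, cache_dir=".threestudio_cache/text_embeddings"):
    os.makedirs(cache_dir, exist_ok=True)
    key = hashlib.md5(prompt.encode()).hexdigest() + ".pt"
    path = os.path.join(cache_dir, key)
    if os.path.exists(path):          # cache hit: skip the encoder entirely
        return torch.load(path)
    tokens = tokenizer(prompt, padding="max_length", truncation=True,
                       return_tensors="pt")
    with torch.no_grad():
        emb = encoder(tokens.input_ids).last_hidden_state
    torch.save(emb, path)             # cache miss: encode once, persist
    return emb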
3. Configuration Notes
Memory Requirements:
- v1 configuration: requires GPUs with 48GB+ memory
- v0 configuration: works with GPUs that have less memory (46GB+), at reduced performance
Acceleration Options:
- Use _acc-2.yaml configs for gradient accumulation to reduce memory usage, as sketched below
Advanced Options:
- For highest quality, use configs/TriplaneTurbo_v1.yaml with system.parallel_guidance=true (requires 98GB+ memory GPUs)
- To disable certain guidance components, add guidance.rd_weight=0 guidance.sd_weight=0 to the command
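For reference, gradient accumulation (what the _acc-2 suffix enables) splits each effective batch across several backward passes, trading extra optimizer steps for lower peak memory. A generic PyTorch sketch of the idea, not threestudio's implementation:
# generic gradient accumulation over 2 micro-batches (mirrors "_acc-2")
import torch
from torch import nn

model = nn.Linear(8, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
micro_batches = [torch.randn(4, 8) for _ in range(8)]  # stand-in data

accum_steps = 2
optimizer.zero_grad()
for step, batch in enumerate(micro_batches):
    loss = model(batch).pow(2).mean() / accum_steps  # scale for a correct average
    loss.backward()                                  # gradients accumulate
    if (step + 1) % accum_steps == 0:
        optimizer.step()
        optimizer.zero_grad()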
Citation
If you find this work helpful, please consider citing our paper:
@inproceedings{ma2025progressive,
title={Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data},
author={Ma, Zhiyuan and Liang, Xinyue and Wu, Rongyuan and Zhu, Xiangyu and Lei, Zhen and Zhang, Lei},
booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition},
year={2025}
}
Acknowledgement
Our code is heavily based on the following works:
- ThreeStudio: A clean and extensible codebase for 3D generation via Score Distillation.
- MVDream: Used as one of our multi-view teachers.
- RichDreamer: Serves as another multi-view teacher for normal and depth supervision.
- 3DTopia: Its text caption dataset is applied in our training and comparison.
- DiffMC: Our solution uses its differentiable marching cubes for mesh rasterization.
- NeuS: We implement its SDF-based volume rendering for dual rendering in our solution.