# Convolutional Reconstruction Model
Official implementation for *CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model*.
**CRM is a feed-forward model which can generate a 3D textured mesh in 10 seconds.**
## [Project Page](https://ml.cs.tsinghua.edu.cn/~zhengyi/CRM/) | [Arxiv](https://arxiv.org/abs/2403.05034) | [HF-Demo](https://huggingface.co/spaces/Zhengyi/CRM) | [Weights](https://huggingface.co/Zhengyi/CRM)
https://github.com/thu-ml/CRM/assets/40787266/8b325bc0-aa74-4c26-92e8-a8f0c1079382
## Try CRM
* Try CRM at [Huggingface Demo](https://huggingface.co/spaces/Zhengyi/CRM).
* Try CRM at [Replicate Demo](https://replicate.com/camenduru/crm). Thanks [@camenduru](https://github.com/camenduru)!
## Install
### Step 1 - Base
Install the packages one by one; we use **Python 3.9**.
```bash
pip install torch==1.13.0+cu117 torchvision==0.14.0+cu117 torchaudio==0.13.0 --extra-index-url https://download.pytorch.org/whl/cu117
pip install torch-scatter==2.1.1 -f https://data.pyg.org/whl/torch-1.13.1+cu117.html
pip install kaolin==0.14.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-1.13.1_cu117.html
pip install -r requirements.txt
```
In addition, xformers needs to be installed manually according to the official [doc](https://github.com/facebookresearch/xformers?tab=readme-ov-file#installing-xformers) (**not needed when installing via conda**), e.g.
```bash
pip install ninja
pip install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers
```
### Step 2 - Nvdiffrast
Install nvdiffrast according to the official [doc](https://nvlabs.github.io/nvdiffrast/#installation), e.g.
```bash
pip install git+https://github.com/NVlabs/nvdiffrast
```
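After both steps, a quick sanity check can save debugging time later. The snippet below is a minimal sketch (not part of the repo) that simply imports the key dependencies installed above and reports whether CUDA is visible.
```python
# Hypothetical environment check: verifies the packages installed above import
# cleanly and that PyTorch can see a CUDA device.
import torch
print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())

import torch_scatter            # noqa: F401
import kaolin
import xformers
import nvdiffrast.torch as dr   # noqa: F401  (CUDA context is created lazily at first render)

print("kaolin", kaolin.__version__)
print("xformers", xformers.__version__)
print("All imports OK")
```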
## Inference
We suggest using Gradio for visualized inference.
```bash
gradio app.py
```

For inference in command lines, simply run
```bash
CUDA_VISIBLE_DEVICES="0" python run.py --inputdir "examples/kunkun.webp"
```
It will output the preprocessed image, the generated six-view images and CCMs, and a 3D model in OBJ format.
**Tips:** (1) If the result is unsatisfactory, please check whether the input image is correctly pre-processed onto a grey background; otherwise the results will be unpredictable.
(2) Unlike the [Huggingface Demo](https://huggingface.co/spaces/Zhengyi/CRM), this official implementation uses a UV texture instead of vertex colors. It produces better textures than the online demo but takes longer to generate owing to the UV texturing.
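If you need to prepare such an input yourself, the sketch below shows one possible way to composite a foreground onto a grey background using `rembg` and Pillow. This is only an illustrative example, not the repo's own preprocessing path; the output filename and the exact grey value are assumptions.
```python
# Illustrative preprocessing sketch (assumptions: rembg for matting,
# mid-grey (127, 127, 127) background, hypothetical output path).
from PIL import Image
from rembg import remove

img = Image.open("examples/kunkun.webp").convert("RGBA")
fg = remove(img)  # background removed, transparency kept in the alpha channel

grey = Image.new("RGBA", fg.size, (127, 127, 127, 255))
out = Image.alpha_composite(grey, fg).convert("RGB")
out.save("examples/kunkun_grey.png")  # then pass this file to run.py --inputdir
```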
## Train
We provide the training scripts for multiview generation along with their data requirements.
To launch a simple single-instance overfitting run of multiview generation:
```shell
accelerate launch $accelerate_args train.py --config configs/nf7_v3_SNR_rd_size_stroke_train.yaml \
config.batch_size=1 \
config.eval_interval=100
```
To launch a simple single-instance overfitting run of CCM generation:
```shell
accelerate launch $accelerate_args train_stage2.py --config configs/stage2-v2-snr_train.yaml \
config.batch_size=1 \
config.eval_interval=100
```
### Data preparation
To specify the data directories, modify the following parameters in `configs/xxxx.yaml`:
```yaml
base_dir: <path to multiview pixel image basedir>
xyz_base: <path to related CCM image basedir>
caption_csv: <path to caption.csv>
```
The file trees of the base directories should look as follows:
```shell
base_dir
├── uid1
│   ├── 000.png
│   ├── 001.png
│   ├── 002.png
│   ├── 003.png
│   ├── 004.png
│   └── 005.png
├── uid2
....
xyz_base
├── uid1
│   ├── xyz_new_000.png
│   ├── xyz_new_001.png
│   ├── xyz_new_002.png
│   ├── xyz_new_003.png
│   ├── xyz_new_004.png
│   └── xyz_new_005.png
├── uid2
....
```
The `train_example` dir shows a minimal example of the training data and the `caption.csv` file.
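Before launching training, it can be handy to verify that the two directory trees and `caption.csv` are consistent. The helper below is a hypothetical sketch (not part of the repo); it assumes the first CSV column is the uid, which may differ from the actual `caption.csv` format.
```python
# Hypothetical layout check: every uid should have six view images in base_dir
# and six matching CCM images in xyz_base.
import csv
from pathlib import Path

def check_layout(base_dir: str, xyz_base: str, caption_csv: str) -> None:
    with open(caption_csv, newline="") as f:
        uids = [row[0] for row in csv.reader(f) if row]
    for uid in uids:
        views = [Path(base_dir) / uid / f"{i:03d}.png" for i in range(6)]
        ccms = [Path(xyz_base) / uid / f"xyz_new_{i:03d}.png" for i in range(6)]
        missing = [p for p in views + ccms if not p.exists()]
        if missing:
            print(f"{uid}: {len(missing)} file(s) missing, e.g. {missing[0]}")

check_layout("<base_dir>", "<xyz_base>", "<caption.csv>")
```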
## Todo List
- [x] Release inference code.
- [x] Release pretrained models.
- [ ] Optimize inference code to fit in low-memory GPUs.
- [x] Upload training code.
## Acknowledgement
- [ImageDream](https://github.com/bytedance/ImageDream)
- [nvdiffrast](https://github.com/NVlabs/nvdiffrast)
- [kiuikit](https://github.com/ashawkey/kiuikit)
- [GET3D](https://github.com/nv-tlabs/GET3D)
## Citation
```
@article{wang2024crm,
title={CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model},
author={Zhengyi Wang and Yikai Wang and Yifei Chen and Chendong Xiang and Shuo Chen and Dajiang Yu and Chongxuan Li and Hang Su and Jun Zhu},
journal={arXiv preprint arXiv:2403.05034},
year={2024}
}
```