Ubaida10 committed on
Commit
4426e7f
·
verified ·
1 Parent(s): 75d9b97

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +118 -0
README.md ADDED
@@ -0,0 +1,118 @@
1
+ # SD-VITON-Virtual-Try-On
2
+ This is the official repository for the following paper:
3
+ > **Towards Squeezing-Averse Virtual Try-On via Sequential Deformation** [[arxiv]](https://arxiv.org/pdf/2312.15861.pdf)
4
+ >
5
+ > Sang-Heon Shim, Jiwoo Chung, Jae-Pil Heo
6
+ > Accepted by **AAAI 2024**.
7
+
8
+ ![teaser](assets/teaser.png) 
9
+
10
+ ## Notice
11
+ This repository is currently intended only for sharing the source code of an academic research paper.
12
+ It has several limitations; please see the Limitations section below.
13
+
14
+ ## News
15
+ - *2024-01-31* We have released the source code and checkpoints.
16
+
17
+
18
+ ## Installation
19
+
20
+ Clone this repository:
21
+
22
+ ```
23
+ git clone https://github.com/SHShim0513/SD-VITON.git
24
+ cd ./SD-VITON/
25
+ ```
26
+
27
+ Install PyTorch and other dependencies:
28
+
29
+ ```
30
+ conda create -n {env_name} python=3.8
31
+ conda activate {env_name}
32
+ conda install pytorch torchvision torchaudio cudatoolkit=11.1 -c pytorch-lts -c nvidia
33
+ pip install opencv-python torchgeometry Pillow tqdm tensorboardX scikit-image scipy timm==0.4.12
34
+ ```
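After installing, it can help to confirm that the packages are importable from the new environment. The following is a minimal sketch, not part of the repository: the helper name `find_missing` is hypothetical, and the list maps the pip packages above to their usual import names (for example `opencv-python` is imported as `cv2` and `Pillow` as `PIL`).

```python
import importlib.util

# Import names for the packages installed above. The pip name and the import
# name differ for some of them (opencv-python -> cv2, Pillow -> PIL,
# scikit-image -> skimage).
REQUIRED_MODULES = [
    "torch", "torchvision", "cv2", "torchgeometry", "PIL",
    "tqdm", "tensorboardX", "skimage", "scipy", "timm",
]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

The check only locates the modules and does not import them, so it runs quickly and does not need a GPU.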
35
+
36
+ ## Dataset
37
+ We train and evaluate our model on the [VITON-HD dataset](https://github.com/shadow2496/VITON-HD).
38
+ We assume that you have downloaded it into `./data`.
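Before running anything, a quick check that the download landed where the scripts expect it can save time. The sketch below is illustrative only: the function name `missing_dirs` is hypothetical, and the sub-directory names are an assumption based on the public VITON-HD release, so adjust the list to match your copy.

```python
from pathlib import Path

# ASSUMPTION: these sub-directory names follow the public VITON-HD release;
# edit the list to match your copy of the dataset.
EXPECTED_SUBDIRS = ["image", "cloth", "cloth-mask"]


def missing_dirs(dataroot, split="test", subdirs=EXPECTED_SUBDIRS):
    """Return the expected sub-directory names missing under dataroot/split."""
    base = Path(dataroot) / split
    return [name for name in subdirs if not (base / name).is_dir()]


# An empty list means every expected directory was found under ./data/test.
print(missing_dirs("./data"))
```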
39
+
40
+ ## Inference
41
+
42
+ Here are the download links for each model checkpoint:
43
+
44
+ |Dataset|Network Type|Output Resolution|Google Drive|
45
+ |--------|--------|--------|-----------|
46
+ | VITON-HD | Try-on condition generator | Appearance flows with 128 x 96 | [Download](https://drive.google.com/drive/folders/1sqKNvyTsF8HGAv72wV2nLIeZmYA1Za9V?usp=drive_link) |
47
+ | VITON-HD | Try-on image generator | Images with 1024 x 768 | [Download](https://drive.google.com/drive/folders/1nsbtVsjC2Y0XEZA9SYYrmI4K3TPr5--p?usp=drive_link) |
48
+
49
+ - AlexNet (LPIPS): [link](https://drive.google.com/file/d/1CJ2HLzlYjp0PXgbeAH90CdJhZbHRVEKN/view?usp=drive_link). We assume that you have downloaded it into `./eval_models/weights/v0.1`.
50
+
51
+ ```bash
52
+ python3 test_generator.py --occlusion --test_name {test_name} --tocg_checkpoint {condition generator ckpt} --gpu_ids {gpu_ids} --gen_checkpoint {image generator ckpt} --datasetting unpaired --dataroot {dataset_path} --data_list {pair_list_textfile} --composition_mask
53
+ ```
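If you prefer to launch inference from Python, the command above can be assembled as an argument list. This is a minimal sketch that uses only the flags shown in the command; the function name `build_test_command` and every path and name in the example call are hypothetical placeholders.

```python
import shlex


def build_test_command(test_name, tocg_ckpt, gen_ckpt, dataroot, data_list, gpu_ids="0"):
    """Assemble the argument list for test_generator.py from the flags shown above."""
    return [
        "python3", "test_generator.py",
        "--occlusion",
        "--test_name", test_name,
        "--tocg_checkpoint", tocg_ckpt,
        "--gpu_ids", str(gpu_ids),
        "--gen_checkpoint", gen_ckpt,
        "--datasetting", "unpaired",
        "--dataroot", dataroot,
        "--data_list", data_list,
        "--composition_mask",
    ]


# Hypothetical placeholder values; replace them with your own paths.
cmd = build_test_command(
    test_name="demo",
    tocg_ckpt="checkpoints/tocg.pth",
    gen_ckpt="checkpoints/gen.pth",
    dataroot="./data",
    data_list="test_pairs.txt",
)
print(shlex.join(cmd))  # shell-quoted form of the same command
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would execute it; keeping the arguments in a list avoids quoting problems with paths that contain spaces.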
54
+ ## Training
55
+
56
+ ### Try-on condition generator
57
+
58
+ ```bash
59
+ python3 train_condition.py --gpu_ids {gpu_ids} --Ddownx2 --Ddropout --interflowloss --occlusion --tvlambda_tvob 2.0 --tvlambda_taco 2.0
60
+ ```
61
+
62
+ ### Try-on image generator
63
+
64
+ ```bash
65
+ python3 train_generator.py --name test -b 4 -j 8 --gpu_ids {gpu_ids} --fp16 --tocg_checkpoint {condition generator ckpt path} --occlusion --composition_mask
66
+ ```
67
+ This stage takes approximately 4 days with two A6000 GPUs.
68
+
69
+ To use the `--fp16` option, you need to install the apex library.
70
+
71
+ ## Limitations
72
+ Our work still has several limitations that, to the best of our knowledge, are not unique to our method.
73
+
74
+ ### Issue #1: crack
75
+
76
+ Several samples suffer from a crack artifact.
77
+ To the best of our knowledge, the crack is amplified by the upsampling of the final appearance flows (AFs).
78
+ *E.g.*, our network infers the final AFs at 128 x 96 resolution and then upscales them to 1024 x 768.
79
+ As a result, the crack regions are enlarged.
80
+
81
+ ![limitation](assets/fig_limitation.jpg)
82
+
83
+ One way to mitigate this is to infer the final AFs at a resolution closer to that of the output image (see "After").
84
+ We provide checkpoints in which the networks infer the AFs at 256 x 192 and the image at 512 x 384 resolution.
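The effect of the resolution gap can be illustrated with a little arithmetic. The sketch below is a toy example rather than the repository's code: it compares the upscaling factors of the two settings and uses nearest-neighbour upsampling of a single row, which is an assumption made for simplicity (the actual interpolation mode may differ), to show how a one-pixel gap widens.

```python
def scale_factors(flow_hw, image_hw):
    """Return the (vertical, horizontal) upscaling factors from flow to image size."""
    return image_hw[0] / flow_hw[0], image_hw[1] / flow_hw[1]


def upsample_row_nearest(row, factor):
    """Nearest-neighbour upsampling of a 1-D row by an integer factor."""
    return [value for value in row for _ in range(factor)]


# Default setting: 128 x 96 flows upscaled to 1024 x 768 images.
print(scale_factors((128, 96), (1024, 768)))  # (8.0, 8.0)
# Alternative setting: 256 x 192 flows upscaled to 512 x 384 images.
print(scale_factors((256, 192), (512, 384)))  # (2.0, 2.0)

# A one-pixel gap (0) in a coarse row grows with the upscaling factor.
coarse = [1, 1, 0, 1]
print(upsample_row_nearest(coarse, 8).count(0))  # 8
print(upsample_row_nearest(coarse, 2).count(0))  # 2
```

In this toy setting the gap is four times narrower with the alternative checkpoints, which matches the intuition behind inferring the flows closer to the image resolution.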
85
+
86
+ |Dataset|Network Type|Output Resolution|Google Drive|
87
+ |--------|--------|--------|-----------|
88
+ | VITON-HD | Try-on condition generator | Appearance flows with 256 x 192 | [Download](https://drive.google.com/drive/folders/1IUJeJQgdwJgoLRZ3v3zlGqKZpVNa-FHM?usp=share_link) |
89
+ | VITON-HD | Try-on image generator | Images with 512 x 384 | [Download](https://drive.google.com/drive/folders/1X4-oAans5bg72aei9rCM0P2tCbFxuB26?usp=share_link) |
90
+
91
+ The corresponding inference command is as follows:
92
+ ```bash
93
+ python3 test_generator.py --occlusion --test_name {test_name} --tocg_checkpoint {condition generator ckpt} --gpu_ids {gpu_ids} --gen_checkpoint {image generator ckpt} --datasetting unpaired --dataroot {dataset_path} --data_list {pair_list_textfile} --fine_width 384 --fine_height 512 --num_upsampling_layers more --cond_G_ngf 48 --cond_G_input_width 384 --cond_G_input_height 512 --cond_G_num_layers 6
94
+ ```
95
+
96
+ ### Issue #2: clothes behind the neck
97
+ As with other methods, our network cannot fully remove the clothing textures behind the neck.
98
+ As a result, they remain in the generated samples.
99
+
100
+ A solution would be to mask out such regions when pre-processing the inputs.
101
+ We did not apply this additional technique, since it was not included in the dataset.
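As a rough illustration of what such masking could look like, the sketch below replaces the pixels selected by a binary mask with a neutral fill value. It is not part of this repository: the function name `mask_out`, the fill value, and the example mask are all hypothetical, and how the behind-the-neck mask would be obtained is left open.

```python
def mask_out(image, mask, fill=128):
    """Return a copy of a grayscale image with masked pixels replaced by fill.

    image and mask are equally sized lists of rows; mask values are 0 or 1.
    """
    return [
        [fill if m else pixel for pixel, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]


image = [[10, 20, 30], [40, 50, 60]]
neck_mask = [[0, 1, 0], [0, 1, 1]]  # hypothetical mask marking pixels to hide
print(mask_out(image, neck_mask))  # [[10, 128, 30], [40, 128, 128]]
```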
102
+
103
+ ## Acknowledgments
104
+
105
+ This repository is built on the HR-VITON repository. We thank the authors for their great work.
106
+
107
+ ## Citation
108
+
109
+ If you find this work useful for your research, please cite our paper:
110
+
111
+ ```
112
+ @article{shim2023towards,
113
+ title={Towards Squeezing-Averse Virtual Try-On via Sequential Deformation},
114
+ author={Shim, Sang-Heon and Chung, Jiwoo and Heo, Jae-Pil},
115
+ journal={arXiv preprint arXiv:2312.15861},
116
+ year={2023}
117
+ }
118
+ ```