tttoaster committed on
Commit
255addb
1 Parent(s): 57f5d82

Update README.md

  1. README.md +75 -0
README.md CHANGED
@@ -1,3 +1,78 @@
1
  ---
2
  license: llama2
3
  ---
4
+ # SEED Multimodal
5
+
6
+ [Project Homepage](https://ailab-cvc.github.io/seed/)
7
+
8
+ **Powered by [CV Center, Tencent AI Lab](https://ailab-cvc.github.io), and [ARC Lab, Tencent PCG](https://github.com/TencentARC).**
9
+
10
+ ## Usage
11
+
12
+ ### Dependencies
13
+ - Python >= 3.8 (we recommend using [Anaconda](https://www.anaconda.com/download/#linux))
14
+ - [PyTorch >= 1.11.0](https://pytorch.org/)
15
+ - NVIDIA GPU + [CUDA](https://developer.nvidia.com/cuda-downloads)
16
+
17
+ ### Installation
18
+ 1. Clone repo
19
+
20
+ ```bash
21
+ git clone https://github.com/AILab-CVC/SEED.git
22
+ cd SEED
23
+ ```
24
+
25
+ 2. Install dependent packages
26
+
27
+ ```bash
28
+ pip install -r requirements.txt
29
+ ```
30
+
31
+ ### Model Weights
32
+ We provide the pretrained SEED Tokenizer and De-Tokenizer, as well as the instruction-tuned SEED-LLaMA-8B and SEED-LLaMA-14B.
33
+ Please download the checkpoints and save them under the `./pretrained` folder.
34
+
35
+ To reconstruct the image from the SEED visual codes using unCLIP SD-UNet, please download the pretrained [unCLIP SD](https://huggingface.co/stabilityai/stable-diffusion-2-1-unclip).
36
+ Rename the checkpoint directory to **"diffusion_model"** and create a soft link to it inside the "pretrained/seed_tokenizer" directory.
37
+
38
+
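The rename-and-link step above can be sketched as a small shell helper. This is an illustration under assumptions, not a script shipped with the repo: `link_diffusion_model` is a hypothetical name, the location of the downloaded checkpoint is whatever you pass in, and the link is assumed to belong at `pretrained/seed_tokenizer/diffusion_model`.

```bash
# Sketch only: link_diffusion_model is a hypothetical helper, not part of the SEED repo.
# $1: path to the downloaded unCLIP SD checkpoint directory (its location is an assumption)
# $2: root of the SEED checkout (defaults to the current directory)
link_diffusion_model() {
  local ckpt_dir="$1"
  local repo_root="${2:-.}"
  local renamed

  # Absolute path of the renamed directory, next to the original checkpoint.
  renamed="$(cd "$(dirname "$ckpt_dir")" && pwd)/diffusion_model"

  # Rename the checkpoint directory to "diffusion_model".
  mv "$ckpt_dir" "$renamed" || return 1

  # Create the soft link inside pretrained/seed_tokenizer (assumed link location).
  mkdir -p "$repo_root/pretrained/seed_tokenizer"
  ln -s "$renamed" "$repo_root/pretrained/seed_tokenizer/diffusion_model"
}
```

From the root of the SEED checkout, a call might look like `link_diffusion_model /path/to/stable-diffusion-2-1-unclip .`, where the first argument is a placeholder for wherever the checkpoint was downloaded.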
39
+ ### Inference for visual tokenization and de-tokenization
40
+ To discretize an image into 1D visual codes with causal dependency and reconstruct the image from those codes using the off-the-shelf unCLIP SD-UNet, run:
41
+ ```bash
42
+ python scripts/seed_tokenizer_inference.py
43
+ ```
44
+
45
+ ### Launching Demo of SEED-LLaMA Locally
46
+ ```bash
47
+ sh start_backend.sh
48
+ sh start_frontend.sh
49
+ ```
50
+
51
+ ## Citation
52
+ If you find the work helpful, please consider citing:
53
+ ```bibtex
54
+ @article{ge2023making,
55
+ title={Making LLaMA SEE and Draw with SEED Tokenizer},
56
+ author={Ge, Yuying and Zhao, Sijie and Zeng, Ziyun and Ge, Yixiao and Li, Chen and Wang, Xintao and Shan, Ying},
57
+ journal={arXiv preprint arXiv:2310.01218},
58
+ year={2023}
59
+ }
60
+
61
+ @article{ge2023planting,
62
+ title={Planting a seed of vision in large language model},
63
+ author={Ge, Yuying and Ge, Yixiao and Zeng, Ziyun and Wang, Xintao and Shan, Ying},
64
+ journal={arXiv preprint arXiv:2307.08041},
65
+ year={2023}
66
+ }
67
+ ```
68
+
69
+ The project is still in progress. Stay tuned for more updates!
70
+
71
+ ## License
72
+ `SEED` is released under the [Apache License, Version 2.0](License.txt).
73
+
74
+ `SEED-LLaMA` is released under the original [License](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) of [LLaMA2](https://huggingface.co/meta-llama/Llama-2-13b-chat-hf).
75
+
76
+ ## Acknowledgement
77
+ We thank the authors of [unCLIP SD](https://huggingface.co/stabilityai/stable-diffusion-2-1-unclip) and [BLIP2](https://github.com/salesforce/LAVIS) for their great work.
78
+