Add metadata tags, link to paper #1
by nielsr (HF staff) - opened

README.md CHANGED
@@ -1,3 +1,8 @@
+---
+pipeline_tag: image-to-image
+license: mit
+---
+
 # OmniBooth
 
 > OmniBooth: Learning Latent Control for Image Synthesis with Multi-modal Instruction <br>
@@ -5,7 +10,7 @@
 
 OmniBooth is a project focused on synthesizing image data following multi-modal instruction. Users can use text or image to control instance generation. This repository provides tools and scripts to process, train, and generate synthetic image data using COCO dataset, or self-designed data.
 
-#### [Project Page](https://len-li.github.io/omnibooth-web) | [Paper](https://
+#### [Project Page](https://len-li.github.io/omnibooth-web) | [Paper](https://huggingface.co/papers/2410.04932) | [Video](https://len-li.github.io/omnibooth-web/videos/teaser-user-draw.mp4) | [Checkpoint](https://huggingface.co/lilelife/Omnibooth)
 
 code: https://github.com/Len-Li/OmniBooth
 
@@ -18,12 +23,6 @@ code: https://github.com/Len-Li/OmniBooth
 - [Inference](#inference)
 - [Behavior analysis](#behavior-analysis)
 - [Data sturture](#instance-data-structure)
-
-
-
-
-
-
 
 ## Installation
 
@@ -45,9 +44,6 @@ To get started with OmniBooth, follow these steps:
 pip install git+https://github.com/cocodataset/panopticapi.git
 ```
 
-
-
-
 ## Prepare Dataset
 
 You can skip this step if you just want to run a demo generation. I've prepared demo mask in `data/instance_dataset` for generation. Please see [Inference](#inference).
@@ -59,7 +55,6 @@ To train OmniBooth, follow the steps below:
 We use COCONut-S split.
 Please download the COCONut-S file and relabeled-COCO-val from [here](https://github.com/bytedance/coconut_cvpr2024?tab=readme-ov-file#dataset-splits) and put it in `data/coconut_dataset` folder. I recommend to use [Kaggle](https://www.kaggle.com/datasets/xueqingdeng/coconut) link.
 
-
 2. **Download the COCO dataset:**
 ```
 cd data/coconut_dataset
@@ -73,9 +68,6 @@ To train OmniBooth, follow the steps below:
 unzip annotations_trainval2017.zip
 ```
 
-
-
-
 After preparation, you will be able to see the following directory structure:
 
 ```
@@ -183,7 +175,6 @@ The mask file is a binary mask that indicate the instance location. The image fi
 }
 ```
 
-
 ## Acknowledgment
 Additionally, we express our gratitude to the authors of the following opensource projects:
 
@@ -192,7 +183,6 @@ Additionally, we express our gratitude to the authors of the following opensourc
 - [SyntheOcc](https://len-li.github.io/syntheocc-web/) (Network structure)
 
 
-
 ## BibTeX
 
 ```bibtex
@@ -204,12 +194,4 @@ Additionally, we express our gratitude to the authors of the following opensourc
 }
 ```
 
-This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
-
-
-
-
-
----
-license: mit
----
+This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
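The two keys added in the front matter above are read by the Hub: `pipeline_tag` files the repo under the image-to-image task and `license` sets the license shown on the model page. A minimal sketch for checking them after merge, using the `huggingface_hub` client and assuming this README is the model card of the `lilelife/Omnibooth` checkpoint repository linked above (the repo id is an assumption, not stated in the diff):

```python
# Sketch: verify the card metadata added in this PR via the huggingface_hub client.
# The repo id below is assumed from the Checkpoint link in the README; adjust if needed.
from huggingface_hub import ModelCard

card = ModelCard.load("lilelife/Omnibooth")

# The YAML front matter between the `---` markers is parsed into card.data.
print(card.data.pipeline_tag)  # expected: image-to-image
print(card.data.license)       # expected: mit
```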