sudemai committed on
Commit
2ffffce
1 Parent(s): 8ea52fa

Update README.md

  1. README.md +103 -93
README.md CHANGED
@@ -1,93 +1,103 @@
- <h1 align="center">OmniGen: Unified Image Generation</h1>
-
-
- <p align="center">
- <a href="">
- <img alt="Build" src="https://img.shields.io/badge/Project%20Page-OmniGen-yellow">
- </a>
- <a href="https://arxiv.org/abs/2409.11340">
- <img alt="Build" src="https://img.shields.io/badge/arXiv%20paper-2409.11340-b31b1b.svg">
- </a>
- <a href="https://huggingface.co/spaces/Shitao/OmniGen">
- <img alt="License" src="https://img.shields.io/badge/HF%20Demo-🤗-lightblue">
- </a>
- <a href="https://huggingface.co/Shitao/OmniGen-v1">
- <img alt="Build" src="https://img.shields.io/badge/HF%20Model-🤗-yellow">
- </a>
- </p>
-
- <h4 align="center">
- <p>
- <a href="#2-credits-for-quantized-version">Credits for Quantized version</a> |
- <a href=#3-methodology>Methodology</a> |
- <a href=#4-what-can-omnigen-do>Capabilities</a> |
- <a href="#license">License</a> |
- <a href="#citation">Citation</a>
- </p>
- </h4>
-
-
- ## 1. Overview
-
- OmniGen is a unified image generation model that can generate a wide range of images from multi-modal prompts. It is designed to be simple, flexible and easy to use. We provide [inference code](#5-quick-start) so that everyone can explore more functionalities of OmniGen.
-
- Existing image generation models often require loading several additional network modules (such as ControlNet, IP-Adapter, Reference-Net, etc.) and performing extra preprocessing steps (e.g., face detection, pose estimation, cropping, etc.) to generate a satisfactory image. However, **we believe that the future image generation paradigm should be simpler and more flexible, that is, generating various images directly through arbitrary multi-modal instructions without the need for additional plugins and operations, similar to how GPT works in language generation.**
-
- Due to limited resources, OmniGen still has room for improvement. We will continue to optimize it, and hope it inspires more universal image generation models. You can also easily fine-tune OmniGen without worrying about designing networks for specific tasks; you just need to prepare the corresponding data, and then run the [script](#6-finetune). Imagination is no longer limited; everyone can construct any image generation task, and perhaps we can achieve very interesting, wonderful and creative things.
-
- If you have any questions, ideas or interesting tasks you want OmniGen to accomplish, feel free to discuss them with us: 2906698981@qq.com, wangyueze@tju.edu.cn, zhengliu1026@gmail.com. We welcome any feedback to help us improve the model.
-
-
-
- ## 2. Credits for Quantized version
- - https://github.com/Manni1000
-
-
-
- ## 3. Methodology
-
- You can see details in our [paper](https://arxiv.org/abs/2409.11340).
-
-
- ## 4. What Can OmniGen do?
-
-
- OmniGen is a unified image generation model that you can use to perform various tasks, including but not limited to text-to-image generation, subject-driven generation, Identity-Preserving Generation, image editing, and image-conditioned generation. **OmniGen doesn't need additional plugins or operations; it can automatically identify the features (e.g., required object, human pose, depth mapping) in input images according to the text prompt.**
- We showcase some examples in [inference.ipynb](inference.ipynb). In [inference_demo.ipynb](inference_demo.ipynb), we show an interesting pipeline to generate and modify an image.
-
- Here is an illustration of OmniGen's capabilities:
- - You can control the image generation flexibly via OmniGen
- ![demo](./imgs/demo_cases.png)
- - Referring Expression Generation: You can generate images by simply referring to objects, and OmniGen will automatically recognize the required objects in the image.
- ![demo](./imgs/referring.png)
-
- If you are not entirely satisfied with certain functionalities or wish to add new capabilities, you can try [fine-tuning OmniGen](#6-finetune).
-
-
-
- ## 5. Quick Start
-
- ### Please refer to the YouTube video for installation
-
- https://www.youtube.com/watch?v=9ZXmXA2AJZ4
-
-
- ## License
- This repo is licensed under the [MIT License](LICENSE).
-
-
- ## Citation
- If you find this repository useful, please consider giving it a star ⭐ and citing it
- ```
- @article{xiao2024omnigen,
- title={Omnigen: Unified image generation},
- author={Xiao, Shitao and Wang, Yueze and Zhou, Junjie and Yuan, Huaying and Xing, Xingrun and Yan, Ruiran and Wang, Shuting and Huang, Tiejun and Liu, Zheng},
- journal={arXiv preprint arXiv:2409.11340},
- year={2024}
- }
- ```
-
-
-
-
-
+ ---
+ title: OmniGen
+ emoji: 🦋
+ colorFrom: green
+ colorTo: blue
+ sdk: gradio
+ sdk_version: 5.4.0
+ app_file: run.py
+ pinned: false
+ ---
+ <h1 align="center">OmniGen: Unified Image Generation</h1>
+
+
+ <p align="center">
+ <a href="">
+ <img alt="Build" src="https://img.shields.io/badge/Project%20Page-OmniGen-yellow">
+ </a>
+ <a href="https://arxiv.org/abs/2409.11340">
+ <img alt="Build" src="https://img.shields.io/badge/arXiv%20paper-2409.11340-b31b1b.svg">
+ </a>
+ <a href="https://huggingface.co/spaces/Shitao/OmniGen">
+ <img alt="License" src="https://img.shields.io/badge/HF%20Demo-🤗-lightblue">
+ </a>
+ <a href="https://huggingface.co/Shitao/OmniGen-v1">
+ <img alt="Build" src="https://img.shields.io/badge/HF%20Model-🤗-yellow">
+ </a>
+ </p>
+
+ <h4 align="center">
+ <p>
+ <a href="#2-credits-for-quantized-version">Credits for Quantized version</a> |
+ <a href=#3-methodology>Methodology</a> |
+ <a href=#4-what-can-omnigen-do>Capabilities</a> |
+ <a href="#license">License</a> |
+ <a href="#citation">Citation</a>
+ </p>
+ </h4>
+
+
+ ## 1. Overview
+
+ OmniGen is a unified image generation model that can generate a wide range of images from multi-modal prompts. It is designed to be simple, flexible and easy to use. We provide [inference code](#5-quick-start) so that everyone can explore more functionalities of OmniGen.
+
+ Existing image generation models often require loading several additional network modules (such as ControlNet, IP-Adapter, Reference-Net, etc.) and performing extra preprocessing steps (e.g., face detection, pose estimation, cropping, etc.) to generate a satisfactory image. However, **we believe that the future image generation paradigm should be simpler and more flexible, that is, generating various images directly through arbitrary multi-modal instructions without the need for additional plugins and operations, similar to how GPT works in language generation.**
+
+ Due to limited resources, OmniGen still has room for improvement. We will continue to optimize it, and hope it inspires more universal image generation models. You can also easily fine-tune OmniGen without worrying about designing networks for specific tasks; you just need to prepare the corresponding data, and then run the [script](#6-finetune). Imagination is no longer limited; everyone can construct any image generation task, and perhaps we can achieve very interesting, wonderful and creative things.
+
+ If you have any questions, ideas or interesting tasks you want OmniGen to accomplish, feel free to discuss them with us: 2906698981@qq.com, wangyueze@tju.edu.cn, zhengliu1026@gmail.com. We welcome any feedback to help us improve the model.
+
+
+
+ ## 2. Credits for Quantized version
+ - https://github.com/Manni1000
+
+
+
+ ## 3. Methodology
+
+ You can see details in our [paper](https://arxiv.org/abs/2409.11340).
+
+
+ ## 4. What Can OmniGen do?
+
+
+ OmniGen is a unified image generation model that you can use to perform various tasks, including but not limited to text-to-image generation, subject-driven generation, Identity-Preserving Generation, image editing, and image-conditioned generation. **OmniGen doesn't need additional plugins or operations; it can automatically identify the features (e.g., required object, human pose, depth mapping) in input images according to the text prompt.**
+ We showcase some examples in [inference.ipynb](inference.ipynb). In [inference_demo.ipynb](inference_demo.ipynb), we show an interesting pipeline to generate and modify an image.
+
+ Here is an illustration of OmniGen's capabilities:
+ - You can control the image generation flexibly via OmniGen
+ ![demo](./imgs/demo_cases.png)
+ - Referring Expression Generation: You can generate images by simply referring to objects, and OmniGen will automatically recognize the required objects in the image.
+ ![demo](./imgs/referring.png)
+
+ If you are not entirely satisfied with certain functionalities or wish to add new capabilities, you can try [fine-tuning OmniGen](#6-finetune).
+
+
+
+ ## 5. Quick Start
+
+ ### Please refer to the YouTube video for installation
+
+ https://www.youtube.com/watch?v=9ZXmXA2AJZ4
+
+
+ ## License
+ This repo is licensed under the [MIT License](LICENSE).
+
+
+ ## Citation
+ If you find this repository useful, please consider giving it a star ⭐ and citing it
+ ```
+ @article{xiao2024omnigen,
+ title={Omnigen: Unified image generation},
+ author={Xiao, Shitao and Wang, Yueze and Zhou, Junjie and Yuan, Huaying and Xing, Xingrun and Yan, Ruiran and Wang, Shuting and Huang, Tiejun and Liu, Zheng},
+ journal={arXiv preprint arXiv:2409.11340},
+ year={2024}
+ }
+ ```
+
+
+
+
+
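
The only substantive addition in this commit is the block between the two `---` lines at the top of README.md, which Hugging Face Spaces reads as the Space's configuration (here `sdk: gradio`, `sdk_version: 5.4.0`, `app_file: run.py`). As a rough illustration of how such a block separates from the README body, here is a minimal Python sketch. The helper name `split_front_matter` is hypothetical, it is not how Hugging Face itself parses the file, and it only handles flat `key: value` lines rather than full YAML.

```python
def split_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Split README text into (metadata, body).

    Hypothetical helper for illustration: it only understands flat
    `key: value` lines, not nested or multi-line YAML.
    """
    lines = text.splitlines()
    # Front matter must start on the very first line.
    if not lines or lines[0].strip() != "---":
        return {}, text
    # Find the closing delimiter; without one, treat everything as body.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        return {}, text
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()  # values stay as plain strings
    body = "\n".join(lines[end + 1:])
    return meta, body


# The front matter added by this commit, followed by the first README line.
README = """---
title: OmniGen
emoji: 🦋
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 5.4.0
app_file: run.py
pinned: false
---
<h1 align="center">OmniGen: Unified Image Generation</h1>
"""

meta, body = split_front_matter(README)
print(meta["sdk"], meta["sdk_version"], meta["app_file"])
```

Values are kept as strings in this sketch, so `pinned` comes back as the text `false` rather than a boolean; a real YAML parser would convert it.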