soft-boy committed on
Commit
affe6d7
1 Parent(s): b0a43a6

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ images/control_imgs.png filter=lfs diff=lfs merge=lfs -text
37
+ images/imgs.png filter=lfs diff=lfs merge=lfs -text
38
+ images/intro.png filter=lfs diff=lfs merge=lfs -text
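The three added rules route specific image paths through Git LFS, alongside the existing wildcard rules. As a rough illustration only, the sketch below approximates how the patterns visible in this hunk would classify a path. It is not Git's matcher: the helper name `is_lfs_tracked` is hypothetical, and the rule "patterns without a slash match the file name, patterns with a slash match the full path" is a simplification of the real gitattributes semantics.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Only the patterns visible in this hunk; the real file contains more rules.
LFS_PATTERNS = [
    "*.zip",
    "*.zst",
    "*tfevents*",
    "images/control_imgs.png",
    "images/imgs.png",
    "images/intro.png",
]


def is_lfs_tracked(path: str) -> bool:
    """Approximate check: would one of the listed patterns apply to `path`?"""
    name = PurePosixPath(path).name
    for pattern in LFS_PATTERNS:
        # Simplification: slash-free patterns are compared with the file name,
        # patterns containing a slash are compared with the whole path.
        target = path if "/" in pattern else name
        if fnmatchcase(target, pattern):
            return True
    return False


if __name__ == "__main__":
    for candidate in ("images/intro.png", "images/method1.png", "data/archive.zip"):
        print(candidate, is_lfs_tracked(candidate))
```

For an authoritative answer on a real checkout, `git check-attr filter -- <path>` reports the attribute Git actually applies.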
.gitignore ADDED
@@ -0,0 +1,115 @@
1
+ src
2
+ data
3
+ _backup
4
+
5
+ # Byte-compiled / optimized / DLL files
6
+ __pycache__/
7
+ *.py[cod]
8
+ *$py.class
9
+
10
+ # C extensions
11
+ *.so
12
+
13
+ # Distribution / packaging
14
+ .Python
15
+ build/
16
+ develop-eggs/
17
+ dist/
18
+ downloads/
19
+ eggs/
20
+ .eggs/
21
+ lib/
22
+ lib64/
23
+ parts/
24
+ sdist/
25
+ var/
26
+ wheels/
27
+ *.egg-info/
28
+ .installed.cfg
29
+ *.egg
30
+ MANIFEST
31
+
32
+ # PyInstaller
33
+ # Usually these files are written by a python script from a template
34
+ # before PyInstaller builds the exe, so as to inject date/other infos into it.
35
+ *.manifest
36
+ *.spec
37
+
38
+ # Installer logs
39
+ pip-log.txt
40
+ pip-delete-this-directory.txt
41
+
42
+ # Unit test / coverage reports
43
+ htmlcov/
44
+ .tox/
45
+ .nox/
46
+ .coverage
47
+ .coverage.*
48
+ .cache
49
+ nosetests.xml
50
+ coverage.xml
51
+ *.cover
52
+ .hypothesis/
53
+ .pytest_cache/
54
+
55
+ # Translations
56
+ *.mo
57
+ *.pot
58
+
59
+ # Django stuff:
60
+ *.log
61
+ local_settings.py
62
+ db.sqlite3
63
+
64
+ # Flask stuff:
65
+ instance/
66
+ .webassets-cache
67
+
68
+ # Scrapy stuff:
69
+ .scrapy
70
+
71
+ # Sphinx documentation
72
+ docs/_build/
73
+
74
+ # PyBuilder
75
+ target/
76
+
77
+ # Jupyter Notebook
78
+ .ipynb_checkpoints
79
+
80
+ # IPython
81
+ profile_default/
82
+ ipython_config.py
83
+
84
+ # pyenv
85
+ .python-version
86
+
87
+ # celery beat schedule file
88
+ celerybeat-schedule
89
+
90
+ # SageMath parsed files
91
+ *.sage.py
92
+
93
+ # Environments
94
+ .env
95
+ .venv
96
+ env/
97
+ venv/
98
+ ENV/
99
+ env.bak/
100
+ venv.bak/
101
+
102
+ # Spyder project settings
103
+ .spyderproject
104
+ .spyproject
105
+
106
+ # Rope project settings
107
+ .ropeproject
108
+
109
+ # mkdocs documentation
110
+ /site
111
+
112
+ # mypy
113
+ .mypy_cache/
114
+ .dmypy.json
115
+ dmypy.json
LICENSE ADDED
@@ -0,0 +1,201 @@
1
+ Apache License
2
+ Version 2.0, January 2004
3
+ http://www.apache.org/licenses/
4
+
5
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6
+
7
+ 1. Definitions.
8
+
9
+ "License" shall mean the terms and conditions for use, reproduction,
10
+ and distribution as defined by Sections 1 through 9 of this document.
11
+
12
+ "Licensor" shall mean the copyright owner or entity authorized by
13
+ the copyright owner that is granting the License.
14
+
15
+ "Legal Entity" shall mean the union of the acting entity and all
16
+ other entities that control, are controlled by, or are under common
17
+ control with that entity. For the purposes of this definition,
18
+ "control" means (i) the power, direct or indirect, to cause the
19
+ direction or management of such entity, whether by contract or
20
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
21
+ outstanding shares, or (iii) beneficial ownership of such entity.
22
+
23
+ "You" (or "Your") shall mean an individual or Legal Entity
24
+ exercising permissions granted by this License.
25
+
26
+ "Source" form shall mean the preferred form for making modifications,
27
+ including but not limited to software source code, documentation
28
+ source, and configuration files.
29
+
30
+ "Object" form shall mean any form resulting from mechanical
31
+ transformation or translation of a Source form, including but
32
+ not limited to compiled object code, generated documentation,
33
+ and conversions to other media types.
34
+
35
+ "Work" shall mean the work of authorship, whether in Source or
36
+ Object form, made available under the License, as indicated by a
37
+ copyright notice that is included in or attached to the work
38
+ (an example is provided in the Appendix below).
39
+
40
+ "Derivative Works" shall mean any work, whether in Source or Object
41
+ form, that is based on (or derived from) the Work and for which the
42
+ editorial revisions, annotations, elaborations, or other modifications
43
+ represent, as a whole, an original work of authorship. For the purposes
44
+ of this License, Derivative Works shall not include works that remain
45
+ separable from, or merely link (or bind by name) to the interfaces of,
46
+ the Work and Derivative Works thereof.
47
+
48
+ "Contribution" shall mean any work of authorship, including
49
+ the original version of the Work and any modifications or additions
50
+ to that Work or Derivative Works thereof, that is intentionally
51
+ submitted to Licensor for inclusion in the Work by the copyright owner
52
+ or by an individual or Legal Entity authorized to submit on behalf of
53
+ the copyright owner. For the purposes of this definition, "submitted"
54
+ means any form of electronic, verbal, or written communication sent
55
+ to the Licensor or its representatives, including but not limited to
56
+ communication on electronic mailing lists, source code control systems,
57
+ and issue tracking systems that are managed by, or on behalf of, the
58
+ Licensor for the purpose of discussing and improving the Work, but
59
+ excluding communication that is conspicuously marked or otherwise
60
+ designated in writing by the copyright owner as "Not a Contribution."
61
+
62
+ "Contributor" shall mean Licensor and any individual or Legal Entity
63
+ on behalf of whom a Contribution has been received by Licensor and
64
+ subsequently incorporated within the Work.
65
+
66
+ 2. Grant of Copyright License. Subject to the terms and conditions of
67
+ this License, each Contributor hereby grants to You a perpetual,
68
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
69
+ copyright license to reproduce, prepare Derivative Works of,
70
+ publicly display, publicly perform, sublicense, and distribute the
71
+ Work and such Derivative Works in Source or Object form.
72
+
73
+ 3. Grant of Patent License. Subject to the terms and conditions of
74
+ this License, each Contributor hereby grants to You a perpetual,
75
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
76
+ (except as stated in this section) patent license to make, have made,
77
+ use, offer to sell, sell, import, and otherwise transfer the Work,
78
+ where such license applies only to those patent claims licensable
79
+ by such Contributor that are necessarily infringed by their
80
+ Contribution(s) alone or by combination of their Contribution(s)
81
+ with the Work to which such Contribution(s) was submitted. If You
82
+ institute patent litigation against any entity (including a
83
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
84
+ or a Contribution incorporated within the Work constitutes direct
85
+ or contributory patent infringement, then any patent licenses
86
+ granted to You under this License for that Work shall terminate
87
+ as of the date such litigation is filed.
88
+
89
+ 4. Redistribution. You may reproduce and distribute copies of the
90
+ Work or Derivative Works thereof in any medium, with or without
91
+ modifications, and in Source or Object form, provided that You
92
+ meet the following conditions:
93
+
94
+ (a) You must give any other recipients of the Work or
95
+ Derivative Works a copy of this License; and
96
+
97
+ (b) You must cause any modified files to carry prominent notices
98
+ stating that You changed the files; and
99
+
100
+ (c) You must retain, in the Source form of any Derivative Works
101
+ that You distribute, all copyright, patent, trademark, and
102
+ attribution notices from the Source form of the Work,
103
+ excluding those notices that do not pertain to any part of
104
+ the Derivative Works; and
105
+
106
+ (d) If the Work includes a "NOTICE" text file as part of its
107
+ distribution, then any Derivative Works that You distribute must
108
+ include a readable copy of the attribution notices contained
109
+ within such NOTICE file, excluding those notices that do not
110
+ pertain to any part of the Derivative Works, in at least one
111
+ of the following places: within a NOTICE text file distributed
112
+ as part of the Derivative Works; within the Source form or
113
+ documentation, if provided along with the Derivative Works; or,
114
+ within a display generated by the Derivative Works, if and
115
+ wherever such third-party notices normally appear. The contents
116
+ of the NOTICE file are for informational purposes only and
117
+ do not modify the License. You may add Your own attribution
118
+ notices within Derivative Works that You distribute, alongside
119
+ or as an addendum to the NOTICE text from the Work, provided
120
+ that such additional attribution notices cannot be construed
121
+ as modifying the License.
122
+
123
+ You may add Your own copyright statement to Your modifications and
124
+ may provide additional or different license terms and conditions
125
+ for use, reproduction, or distribution of Your modifications, or
126
+ for any such Derivative Works as a whole, provided Your use,
127
+ reproduction, and distribution of the Work otherwise complies with
128
+ the conditions stated in this License.
129
+
130
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
131
+ any Contribution intentionally submitted for inclusion in the Work
132
+ by You to the Licensor shall be under the terms and conditions of
133
+ this License, without any additional terms or conditions.
134
+ Notwithstanding the above, nothing herein shall supersede or modify
135
+ the terms of any separate license agreement you may have executed
136
+ with Licensor regarding such Contributions.
137
+
138
+ 6. Trademarks. This License does not grant permission to use the trade
139
+ names, trademarks, service marks, or product names of the Licensor,
140
+ except as required for reasonable and customary use in describing the
141
+ origin of the Work and reproducing the content of the NOTICE file.
142
+
143
+ 7. Disclaimer of Warranty. Unless required by applicable law or
144
+ agreed to in writing, Licensor provides the Work (and each
145
+ Contributor provides its Contributions) on an "AS IS" BASIS,
146
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
147
+ implied, including, without limitation, any warranties or conditions
148
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
149
+ PARTICULAR PURPOSE. You are solely responsible for determining the
150
+ appropriateness of using or redistributing the Work and assume any
151
+ risks associated with Your exercise of permissions under this License.
152
+
153
+ 8. Limitation of Liability. In no event and under no legal theory,
154
+ whether in tort (including negligence), contract, or otherwise,
155
+ unless required by applicable law (such as deliberate and grossly
156
+ negligent acts) or agreed to in writing, shall any Contributor be
157
+ liable to You for damages, including any direct, indirect, special,
158
+ incidental, or consequential damages of any character arising as a
159
+ result of this License or out of the use or inability to use the
160
+ Work (including but not limited to damages for loss of goodwill,
161
+ work stoppage, computer failure or malfunction, or any and all
162
+ other commercial damages or losses), even if such Contributor
163
+ has been advised of the possibility of such damages.
164
+
165
+ 9. Accepting Warranty or Additional Liability. While redistributing
166
+ the Work or Derivative Works thereof, You may choose to offer,
167
+ and charge a fee for, acceptance of support, warranty, indemnity,
168
+ or other liability obligations and/or rights consistent with this
169
+ License. However, in accepting such obligations, You may act only
170
+ on Your own behalf and on Your sole responsibility, not on behalf
171
+ of any other Contributor, and only if You agree to indemnify,
172
+ defend, and hold each Contributor harmless for any liability
173
+ incurred by, or claims asserted against, such Contributor by reason
174
+ of your accepting any such warranty or additional liability.
175
+
176
+ END OF TERMS AND CONDITIONS
177
+
178
+ APPENDIX: How to apply the Apache License to your work.
179
+
180
+ To apply the Apache License to your work, attach the following
181
+ boilerplate notice, with the fields enclosed by brackets "[]"
182
+ replaced with your own identifying information. (Don't include
183
+ the brackets!) The text should be enclosed in the appropriate
184
+ comment syntax for the file format. We also recommend that a
185
+ file or class name and description of purpose be included on the
186
+ same "printed page" as the copyright notice for easier
187
+ identification within third-party archives.
188
+
189
+ Copyright [yyyy] [name of copyright owner]
190
+
191
+ Licensed under the Apache License, Version 2.0 (the "License");
192
+ you may not use this file except in compliance with the License.
193
+ You may obtain a copy of the License at
194
+
195
+ http://www.apache.org/licenses/LICENSE-2.0
196
+
197
+ Unless required by applicable law or agreed to in writing, software
198
+ distributed under the License is distributed on an "AS IS" BASIS,
199
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200
+ See the License for the specific language governing permissions and
201
+ limitations under the License.
README.md CHANGED
@@ -1,12 +1,132 @@
1
  ---
2
- title: Sdxs
3
- emoji: 🌖
4
- colorFrom: yellow
5
- colorTo: blue
6
  sdk: gradio
7
- sdk_version: 5.4.0
8
- app_file: app.py
9
- pinned: false
10
  ---
 
11
 
12
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
  ---
2
+ title: sdxs
3
+ app_file: demo_sketch.py
 
 
4
  sdk: gradio
5
+ sdk_version: 3.43.1
 
 
6
  ---
7
+ <div align="center">
8
 
9
+ ## SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions
10
+
11
+ [![Project](https://img.shields.io/badge/Home-Project-green?logo=Houzz&logoColor=white)](https://idkiro.github.io/sdxs)
12
+ [![Paper](https://img.shields.io/badge/arxiv-Paper-blue?logo=arxiv)](https://arxiv.org/abs/2403.16627)
13
+ [![SDXS-512-0.9](https://img.shields.io/badge/🤗Model-512--0.9-gold)](https://huggingface.co/IDKiro/sdxs-512-0.9)
14
+ [![SDXS-512-DreamShaper](https://img.shields.io/badge/🤗Model-512--DreamShaper-gold)](https://huggingface.co/IDKiro/sdxs-512-dreamshaper)
15
+ [![SDXS-512-DreamShaper-Anime](https://img.shields.io/badge/🤗Model-512--DreamShaper--Anime-gold)](https://huggingface.co/IDKiro/sdxs-512-dreamshaper-anime)
16
+ [![SDXS-512-DreamShaper-Sketch](https://img.shields.io/badge/🤗Model-512--DreamShaper--Sketch-gold)](https://huggingface.co/IDKiro/sdxs-512-dreamshaper-sketch)
17
+ [![SDXS-512-DreamShaper-Demo](https://img.shields.io/badge/🤗Demo-Text2Image-pink)](https://huggingface.co/spaces/IDKiro/SDXS-512-DreamShaper)
18
+ [![SDXS-512-DreamShaper-Anime-Demo](https://img.shields.io/badge/🤗Demo-Text2Image--Anime-pink)](https://huggingface.co/spaces/IDKiro/SDXS-512-DreamShaper-Anime)
19
+ [![SDXS-512-DreamShaper-Sketch-Demo](https://img.shields.io/badge/🤗Demo-Sketch2Image-pink)](https://huggingface.co/spaces/IDKiro/SDXS-512-DreamShaper-Sketch)
20
+
21
+
22
+ *Yuda Song, Zehao Sun, Xuanwu Yin*
23
+
24
+ </div>
25
+
26
+ We present two models, SDXS-512 and SDXS-1024, achieving inference speeds of approximately <b>100 FPS</b> (30x faster than SD v1.5) and <b>30 FPS</b> (60x faster than SDXL) on a single GPU. If image generation time is limited to <b>1 second</b>, SDXL can only use 16 NFEs to produce a slightly blurry image, while SDXS-1024 can generate 30 clear images.
27
+
28
+ ![](images/intro.png)
29
+
30
+ Moreover, our proposed method can also train ControlNet, offering promising applications in image-conditioned control and facilitating efficient image-to-image translation.
31
+
32
+ <p align="left" >
33
+ <img src="images/sketch.gif" width="800" />
34
+ </p>
35
+
36
+ ## 🔥News
37
+
38
+ - **April 11, 2024:** [SDXS-512-DreamShaper-Anime](https://huggingface.co/IDKiro/sdxs-512-dreamshaper-anime) is released. We also created some Gradio demos on Hugging Face.
39
+ - **April 10, 2024:** [SDXS-512-DreamShaper](https://huggingface.co/IDKiro/sdxs-512-dreamshaper) and [SDXS-512-DreamShaper-Sketch](https://huggingface.co/IDKiro/sdxs-512-dreamshaper-sketch) are released. We also uploaded our demo code.
40
+ - **March 25, 2024:** [SDXS-512-0.9](https://huggingface.co/IDKiro/sdxs-512-0.9) is released; it is an older version of SDXS-512.
41
+
42
+ ## ⚡️Demo
43
+
44
+ Create a new environment:
45
+
46
+ ```sh
47
+ conda create -n sdxs
48
+ ```
49
+
50
+ Activate the new environment:
51
+
52
+ ```sh
53
+ conda activate sdxs
54
+ ```
55
+
56
+ Install requirements:
57
+
58
+ ```sh
59
+ conda install python=3.10 pytorch=2.2.1 torchvision torchaudio pytorch-cuda=11.8 xformers=0.0.25 -c pytorch -c nvidia -c xformers
60
+ pip install -r requirements.txt
61
+ ```
62
+
63
+ Run text-to-image demo:
64
+
65
+ ```sh
66
+ python demo.py
67
+ ```
68
+
69
+ Run anime-style text-to-image (LoRA) demo:
70
+
71
+ ```sh
72
+ python demo_anime.py
73
+ ```
74
+
75
+ Run sketch-to-image (ControlNet) demo:
76
+
77
+ ```sh
78
+ python demo_sketch.py
79
+ ```
80
+
81
+ ## 💡Train
82
+
83
+ I found that [DMD2](https://github.com/tianweiy/DMD2) has released its training code, and its training scheme is identical to that of the new version of SDXS, so you can refer to it.
84
+ Unfortunately, the SDXS training code cannot be open-sourced and will most likely not be updated again.
85
+
86
+ ## ✒️Method
87
+
88
+ ### Model Acceleration
89
+
90
+ We train an extremely light-weight image decoder to mimic the original VAE decoder’s output through a combination of output distillation loss and GAN loss. We also leverage the block removal distillation strategy to efficiently transfer the knowledge from the original U-Net to a more compact version.
91
+
92
+ ![](images/method1.png)
93
+
94
+ SDXS is far more efficient than the base models, generating 512x512 images at about 100 FPS and 1024x1024 images at about 30 FPS on a single GPU.
95
+
96
+ ![](images/speed.png)
97
+
98
+ ### Text-to-Image
99
+
100
+ To reduce the NFEs, we suggest straightening the sampling trajectory and quickly fine-tuning the multi-step model into a one-step model by replacing the distillation loss function with the proposed feature matching loss. Then, we extend the Diff-Instruct training strategy, using the gradient of the proposed feature matching loss to replace the gradient provided by score distillation in the latter half of the timesteps.
101
+
102
+ ![](images/method2.png)
103
+
104
+ Despite a noticeable downsizing in both the sizes of the models and the number of sampling steps required, the prompt-following capability of SDXS-512 remains superior to that of SD v1.5. This observation is consistently validated in the performance of SDXS-1024 as well.
105
+
106
+ ![](images/imgs.png)
107
+
108
+ ### Image-to-Image
109
+
110
+ We extend our proposed training strategy to the training of ControlNet, relying on adding the pretrained ControlNet to the score function.
111
+
112
+ ![](images/method3.png)
113
+
114
+ We demonstrate its efficacy in facilitating image-to-image conversions utilizing ControlNet, specifically for transformations involving canny edges and depth maps.
115
+
116
+ ![](images/control_imgs.png)
117
+
118
+
119
+ ## Citation
120
+
121
+ If you find this work useful for your research, please cite our paper:
122
+
123
+ ```bibtex
124
+ @article{song2024sdxs,
125
+ author = {Yuda Song and Zehao Sun and Xuanwu Yin},
126
+ title = {SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions},
127
+ journal = {arXiv preprint arXiv:2403.16627},
128
+ year = {2024},
129
+ }
130
+ ```
131
+
132
+ **Acknowledgment**: the demo code is based on https://github.com/GaParmar/img2img-turbo.
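The throughput figures quoted in the README can be turned into per-image latencies and implied baseline speeds with simple arithmetic. The sketch below does only that. The constants are the approximate numbers stated in the README, and the helper names `latency_ms` and `implied_baseline_fps` are hypothetical, introduced for illustration. Nothing here is a measurement.

```python
# Approximate figures quoted in the README (single GPU).
SDXS_512_FPS = 100.0      # about 30x faster than SD v1.5
SDXS_1024_FPS = 30.0      # about 60x faster than SDXL
SPEEDUP_VS_SD15 = 30.0
SPEEDUP_VS_SDXL = 60.0


def latency_ms(fps: float) -> float:
    """Per-image latency in milliseconds for a given throughput."""
    return 1000.0 / fps


def implied_baseline_fps(fps: float, speedup: float) -> float:
    """Baseline throughput implied by a quoted speedup factor."""
    return fps / speedup


if __name__ == "__main__":
    print(f"SDXS-512:  {latency_ms(SDXS_512_FPS):.1f} ms per image")
    print(f"SDXS-1024: {latency_ms(SDXS_1024_FPS):.1f} ms per image")
    print(f"Implied SD v1.5: {implied_baseline_fps(SDXS_512_FPS, SPEEDUP_VS_SD15):.2f} FPS")
    print(f"Implied SDXL:    {implied_baseline_fps(SDXS_1024_FPS, SPEEDUP_VS_SDXL):.2f} FPS")
```

Under these quoted numbers, a one-second budget corresponds to about 30 SDXS-1024 images, which matches the README's claim.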
demo.py ADDED
@@ -0,0 +1,117 @@
1
+ import base64
2
+ from io import BytesIO
3
+
4
+ import gradio as gr
5
+ import PIL.Image
6
+ import torch
7
+ from diffusers import StableDiffusionPipeline, AutoencoderKL, AutoencoderTiny
8
+
9
+ device = "mps" # Apple Silicon (Metal); use "cuda" on Linux & Windows
10
+ weight_type = torch.float16 # torch.float32 works as well; float16 pictures seem to be a bit worse
11
+
12
+ pipe = StableDiffusionPipeline.from_pretrained("IDKiro/sdxs-512-dreamshaper", torch_dtype=weight_type)
13
+ pipe.to(torch_device=device, torch_dtype=weight_type)
14
+
15
+ vae_tiny = AutoencoderTiny.from_pretrained("IDKiro/sdxs-512-dreamshaper", subfolder="vae")
16
+ vae_tiny.to(device, dtype=weight_type)
17
+
18
+ vae_large = AutoencoderKL.from_pretrained("IDKiro/sdxs-512-dreamshaper", subfolder="vae_large")
19
+ vae_large.to(device, dtype=weight_type)
20
+
21
+ def pil_image_to_data_url(img, format="PNG"):
22
+ buffered = BytesIO()
23
+ img.save(buffered, format=format)
24
+ img_str = base64.b64encode(buffered.getvalue()).decode()
25
+ return f"data:image/{format.lower()};base64,{img_str}"
26
+
27
+
28
+ def run(
29
+ prompt: str,
30
+ device_type="GPU",
31
+ vae_type=None,
32
+ param_dtype='torch.float16',
33
+ ) -> tuple[PIL.Image.Image, str]:
34
+ if vae_type == "tiny vae":
35
+ pipe.vae = vae_tiny
36
+ elif vae_type == "large vae":
37
+ pipe.vae = vae_large
38
+
39
+ if device_type == "CPU":
40
+ device = "cpu"
41
+ param_dtype = 'torch.float32'
42
+ else:
43
+ device = "mps" # match the module-level device
44
+
45
+ pipe.to(torch_device=device, torch_dtype=torch.float16 if param_dtype == 'torch.float16' else torch.float32)
46
+
47
+ result = pipe(
48
+ prompt=prompt,
49
+ guidance_scale=0.0,
50
+ num_inference_steps=1,
51
+ output_type="pil",
52
+ ).images[0]
53
+
54
+ result_url = pil_image_to_data_url(result)
55
+
56
+ return (result, result_url)
57
+
58
+
59
+ examples = [
60
+ "A photo of beautiful mountain with realistic sunset and blue lake, highly detailed, masterpiece",
61
+ ]
62
+
63
+ with gr.Blocks(css="style.css") as demo:
64
+ gr.Markdown("# SDXS-512-DreamShaper")
65
+ with gr.Group():
66
+ with gr.Row():
67
+ with gr.Column(min_width=685):
68
+ with gr.Row():
69
+ prompt = gr.Text(
70
+ label="Prompt",
71
+ show_label=False,
72
+ max_lines=1,
73
+ placeholder="Enter your prompt",
74
+ container=False,
75
+ )
76
+ run_button = gr.Button("Run", scale=0)
77
+
78
+ device_choices = ['GPU','CPU']
79
+ device_type = gr.Radio(device_choices, label='Device',
80
+ value=device_choices[0],
81
+ interactive=True,
82
+ info='Please choose GPU if you have a GPU.')
83
+
84
+ vae_choices = ['tiny vae','large vae']
85
+ vae_type = gr.Radio(vae_choices, label='Image Decoder Type',
86
+ value=vae_choices[0],
87
+ interactive=True,
88
+ info='To save GPU memory, use tiny vae. For better quality, use large vae.')
89
+
90
+ dtype_choices = ['torch.float16','torch.float32']
91
+ param_dtype = gr.Radio(dtype_choices,label='torch.weight_type',
92
+ value=dtype_choices[0],
93
+ interactive=True,
94
+ info='To save GPU memory, use torch.float16. For better quality, use torch.float32.')
95
+
96
+ download_output = gr.Button("Download output", elem_id="download_output")
97
+
98
+ with gr.Column(min_width=512):
99
+ result = gr.Image(label="Result", height=512, width=512, elem_id="output_image", show_label=False, show_download_button=True)
100
+
101
+ gr.Examples(
102
+ examples=examples,
103
+ inputs=prompt,
104
+ outputs=result,
105
+ fn=run
106
+ )
107
+
108
+ demo.load(None,None,None)
109
+
110
+ inputs = [prompt, device_type, vae_type, param_dtype]
111
+ outputs = [result, download_output]
112
+ prompt.submit(fn=run, inputs=inputs, outputs=outputs)
113
+ run_button.click(fn=run, inputs=inputs, outputs=outputs)
114
+
115
+ if __name__ == "__main__":
116
+ # demo.queue().launch(debug=True, server_port=8080)
117
+ demo.queue().launch(debug=True, server_port=8080)
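`pil_image_to_data_url` in demo.py packs the generated image into a base64 data URL. The sketch below shows the same encoding on raw bytes using only the standard library, plus the reverse step, so the format can be checked without PIL. The names `bytes_to_data_url` and `data_url_to_bytes` are hypothetical helpers and are not part of the demo.

```python
import base64


def bytes_to_data_url(payload: bytes, mime: str = "image/png") -> str:
    """Encode raw bytes as a base64 data URL, e.g. data:image/png;base64,..."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def data_url_to_bytes(url: str) -> bytes:
    """Decode a base64 data URL back into the original bytes."""
    header, _, encoded = url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URL")
    return base64.b64decode(encoded)


# The 8-byte PNG signature stands in for a real encoded image.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
url = bytes_to_data_url(PNG_SIGNATURE)

if __name__ == "__main__":
    print(url)
```

In the demo, the bytes would come from `img.save(buffered, format=format)` followed by `buffered.getvalue()`.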
demo_anime.py ADDED
@@ -0,0 +1,119 @@
1
+ import base64
2
+ from io import BytesIO
3
+
4
+ import gradio as gr
5
+ import PIL.Image
6
+ import torch
7
+
8
+ from diffusers import StableDiffusionPipeline, AutoencoderKL, AutoencoderTiny
9
+ from peft import PeftModel
10
+
11
+ device = "cuda" # Linux & Windows
12
+ weight_type = torch.float16 # torch.float32 works as well; float16 pictures seem to be a bit worse
13
+
14
+ pipe = StableDiffusionPipeline.from_pretrained("IDKiro/sdxs-512-dreamshaper", torch_dtype=weight_type)
15
+ pipe.unet = PeftModel.from_pretrained(pipe.unet, "IDKiro/sdxs-512-dreamshaper-anime")
16
+ pipe.to(torch_device=device, torch_dtype=weight_type)
17
+
18
+ vae_tiny = AutoencoderTiny.from_pretrained("IDKiro/sdxs-512-dreamshaper", subfolder="vae")
19
+ vae_tiny.to(device, dtype=weight_type)
20
+
21
+ vae_large = AutoencoderKL.from_pretrained("IDKiro/sdxs-512-dreamshaper", subfolder="vae_large")
22
+ vae_large.to(device, dtype=weight_type)
23
+
24
+ def pil_image_to_data_url(img, format="PNG"):
25
+ buffered = BytesIO()
26
+ img.save(buffered, format=format)
27
+ img_str = base64.b64encode(buffered.getvalue()).decode()
28
+ return f"data:image/{format.lower()};base64,{img_str}"
29
+
30
+
31
+ def run(
32
+ prompt: str,
33
+ device_type="GPU",
34
+ vae_type=None,
35
+ param_dtype='torch.float16',
36
+ ) -> tuple[PIL.Image.Image, str]:
37
+ if vae_type == "tiny vae":
38
+ pipe.vae = vae_tiny
39
+ elif vae_type == "large vae":
40
+ pipe.vae = vae_large
41
+
42
+ if device_type == "CPU":
43
+ device = "cpu"
44
+ param_dtype = 'torch.float32'
45
+ else:
46
+ device = "cuda"
47
+
48
+ pipe.to(torch_device=device, torch_dtype=torch.float16 if param_dtype == 'torch.float16' else torch.float32)
49
+
50
+ result = pipe(
51
+ prompt=prompt,
52
+ guidance_scale=0.0,
53
+ num_inference_steps=1,
54
+ output_type="pil",
55
+ ).images[0]
56
+
57
+ result_url = pil_image_to_data_url(result)
58
+
59
+ return (result, result_url)
60
+
61
+
62
+ examples = [
63
+ "Self-portrait oil painting, a beautiful cyborg with golden hair, 8k",
64
+ ]
65
+
66
+ with gr.Blocks(css="style.css") as demo:
67
+ gr.Markdown("# SDXS-512-DreamShaper Anime")
68
+ with gr.Group():
69
+ with gr.Row():
70
+ with gr.Column(min_width=685):
71
+ with gr.Row():
72
+ prompt = gr.Text(
73
+ label="Prompt",
74
+ show_label=False,
75
+ max_lines=1,
76
+ placeholder="Enter your prompt",
77
+ container=False,
78
+ )
79
+ run_button = gr.Button("Run", scale=0)
80
+
81
+ device_choices = ['GPU','CPU']
82
+ device_type = gr.Radio(device_choices, label='Device',
83
+ value=device_choices[0],
84
+ interactive=True,
85
+ info='Please choose GPU if you have a GPU.')
86
+
87
+ vae_choices = ['tiny vae','large vae']
88
+ vae_type = gr.Radio(vae_choices, label='Image Decoder Type',
89
+ value=vae_choices[0],
90
+ interactive=True,
91
+ info='To save GPU memory, use tiny vae. For better quality, use large vae.')
92
+
93
+ dtype_choices = ['torch.float16','torch.float32']
94
+ param_dtype = gr.Radio(dtype_choices,label='torch.weight_type',
95
+ value=dtype_choices[0],
96
+ interactive=True,
97
+ info='To save GPU memory, use torch.float16. For better quality, use torch.float32.')
98
+
99
+ download_output = gr.Button("Download output", elem_id="download_output")
100
+
101
+ with gr.Column(min_width=512):
102
+ result = gr.Image(label="Result", height=512, width=512, elem_id="output_image", show_label=False, show_download_button=True)
103
+
104
+ gr.Examples(
105
+ examples=examples,
106
+ inputs=prompt,
107
+ outputs=result,
108
+ fn=run
109
+ )
110
+
111
+ demo.load(None,None,None)
112
+
113
+ inputs = [prompt, device_type, vae_type, param_dtype]
114
+ outputs = [result, download_output]
115
+ prompt.submit(fn=run, inputs=inputs, outputs=outputs)
116
+ run_button.click(fn=run, inputs=inputs, outputs=outputs)
117
+
118
+ if __name__ == "__main__":
119
+ demo.queue().launch(debug=True)
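Both `run` functions apply the same rule before calling `pipe.to(...)`: choosing CPU forces float32, otherwise the accelerator is used with the dtype picked in the UI. The sketch below isolates that decision as a pure function, working on strings so it runs without torch. The name `resolve_device_and_dtype` and its `accelerator` parameter are assumptions introduced for illustration, not part of the demo code.

```python
def resolve_device_and_dtype(
    device_type: str,
    param_dtype: str,
    accelerator: str = "cuda",
) -> tuple[str, str]:
    """Mirror the device/dtype selection done at the top of `run`.

    `accelerator` is the non-CPU backend name ("cuda", or "mps" on Apple Silicon).
    """
    if device_type == "CPU":
        # The demos override the requested dtype with float32 on CPU.
        return "cpu", "torch.float32"
    dtype = "torch.float16" if param_dtype == "torch.float16" else "torch.float32"
    return accelerator, dtype


if __name__ == "__main__":
    print(resolve_device_and_dtype("CPU", "torch.float16"))
    print(resolve_device_and_dtype("GPU", "torch.float16", accelerator="mps"))
```

Keeping the accelerator name in one place would avoid the module-level device and the one used inside `run` drifting apart.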
demo_sketch.py ADDED
@@ -0,0 +1,324 @@
1
+ import random
2
+ import numpy as np
3
+ from PIL import Image
4
+ import base64
5
+ from io import BytesIO
6
+
7
+ import torch
8
+ import torchvision.transforms.functional as F
9
+ from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
10
+ import gradio as gr
11
+
12
+ device = "mps" # Apple Silicon (Metal); use "cuda" on Linux & Windows
13
+ weight_type = torch.float16 # torch.float32 works as well; float16 pictures seem to be a bit worse
14
+
15
+ controlnet = ControlNetModel.from_pretrained(
16
+ "IDKiro/sdxs-512-dreamshaper-sketch", torch_dtype=weight_type
17
+ ).to(device)
18
+ pipe = StableDiffusionControlNetPipeline.from_pretrained(
19
+ "IDKiro/sdxs-512-dreamshaper", controlnet=controlnet, torch_dtype=weight_type
20
+ )
21
+ pipe.to(device)
22
+
+ style_list = [
+     {
+         "name": "No Style",
+         "prompt": "{prompt}",
+     },
+     {
+         "name": "Cinematic",
+         "prompt": "cinematic still {prompt} . emotional, harmonious, vignette, highly detailed, high budget, bokeh, cinemascope, moody, epic, gorgeous, film grain, grainy",
+     },
+     {
+         "name": "3D Model",
+         "prompt": "professional 3d model {prompt} . octane render, highly detailed, volumetric, dramatic lighting",
+     },
+     {
+         "name": "Anime",
+         "prompt": "anime artwork {prompt} . anime style, key visual, vibrant, studio anime, highly detailed",
+     },
+     {
+         "name": "Digital Art",
+         "prompt": "concept art {prompt} . digital artwork, illustrative, painterly, matte painting, highly detailed",
+     },
+     {
+         "name": "Photographic",
+         "prompt": "cinematic photo {prompt} . 35mm photograph, film, bokeh, professional, 4k, highly detailed",
+     },
+     {
+         "name": "Pixel art",
+         "prompt": "pixel-art {prompt} . low-res, blocky, pixel art style, 8-bit graphics",
+     },
+     {
+         "name": "Fantasy art",
+         "prompt": "ethereal fantasy concept art of {prompt} . magnificent, celestial, ethereal, painterly, epic, majestic, magical, fantasy art, cover art, dreamy",
+     },
+     {
+         "name": "Neonpunk",
+         "prompt": "neonpunk style {prompt} . cyberpunk, vaporwave, neon, vibes, vibrant, stunningly beautiful, crisp, detailed, sleek, ultramodern, magenta highlights, dark purple shadows, high contrast, cinematic, ultra detailed, intricate, professional",
+     },
+     {
+         "name": "Manga",
+         "prompt": "manga style {prompt} . vibrant, high-energy, detailed, iconic, Japanese comic style",
+     },
+ ]
+
+ styles = {k["name"]: k["prompt"] for k in style_list}
+ STYLE_NAMES = list(styles.keys())
+ DEFAULT_STYLE_NAME = "No Style"
+ MAX_SEED = np.iinfo(np.int32).max
+
+
+ def pil_image_to_data_url(img, format="PNG"):
+     buffered = BytesIO()
+     img.save(buffered, format=format)
+     img_str = base64.b64encode(buffered.getvalue()).decode()
+     return f"data:image/{format.lower()};base64,{img_str}"
+
+
+ def randomize_seed_fn(seed: int, randomize_seed: bool) -> int:
+     if randomize_seed:
+         seed = random.randint(0, MAX_SEED)
+     return seed
+
+
+ def run(
+     image,
+     prompt,
+     prompt_template,
+     style_name,
+     controlnet_conditioning_scale,
+     device_type="GPU",
+     param_dtype='torch.float16',
+ ):
+     # CPU inference always runs in float32; the GPU branch targets Apple's "mps" backend
+     if device_type == "CPU":
+         device = "cpu"
+         param_dtype = 'torch.float32'
+     else:
+         device = "mps"
+
+     pipe.to(torch_device=device, torch_dtype=torch.float16 if param_dtype == 'torch.float16' else torch.float32)
+
+     print(f"prompt: {prompt}")
+     print("sketch updated")
+     if image is None:
+         # No sketch yet: return a blank white image and point both download links at it
+         ones = Image.new("L", (512, 512), 255)
+         temp_url = pil_image_to_data_url(ones)
+         return ones, gr.update(link=temp_url), gr.update(link=temp_url)
+     prompt = prompt_template.replace("{prompt}", prompt)
+     control_image = image.convert("RGB")
+     # Invert pixel values before passing the sketch to the ControlNet
+     control_image = Image.fromarray(255 - np.array(control_image))
+
+     output_pil = pipe(
+         prompt=prompt,
+         image=control_image,
+         width=512,
+         height=512,
+         guidance_scale=0.0,
+         num_inference_steps=1,
+         num_images_per_prompt=1,
+         output_type="pil",
+         controlnet_conditioning_scale=controlnet_conditioning_scale,
+     ).images[0]
+
+     input_sketch_url = pil_image_to_data_url(control_image)
+     output_image_url = pil_image_to_data_url(output_pil)
+     return (
+         output_pil,
+         gr.update(link=input_sketch_url),
+         gr.update(link=output_image_url),
+     )
+
+
+ def update_canvas(use_line, use_eraser):
+     # Default to the pencil so the brush is defined even when both checkboxes are off
+     _color = "#000000"
+     brush_size = 8
+     if use_eraser:
+         _color = "#ffffff"
+         brush_size = 20
+     if use_line:
+         _color = "#000000"
+         brush_size = 8
+     return gr.update(brush_radius=brush_size, brush_color=_color, interactive=True)
+
+
+ def upload_sketch(file):
+     _img = Image.open(file.name)
+     _img = _img.convert("L")
+     return gr.update(value=_img, source="upload", interactive=True)
+
+
+ scripts = """
+ async () => {
+     globalThis.theSketchDownloadFunction = () => {
+         console.log("test")
+         var link = document.createElement("a");
+         var dataUrl = document.getElementById('download_sketch').href
+         link.setAttribute("href", dataUrl)
+         link.setAttribute("download", "sketch.png")
+         document.body.appendChild(link); // Required for Firefox
+         link.click();
+         document.body.removeChild(link); // Clean up
+
+         // also call the output download function
+         theOutputDownloadFunction();
+         return false
+     }
+
+     globalThis.theOutputDownloadFunction = () => {
+         console.log("test output download function")
+         var link = document.createElement("a");
+         var dataUrl = document.getElementById('download_output').href
+         link.setAttribute("href", dataUrl);
+         link.setAttribute("download", "output.png");
+         document.body.appendChild(link); // Required for Firefox
+         link.click();
+         document.body.removeChild(link); // Clean up
+         return false
+     }
+
+     globalThis.UNDO_SKETCH_FUNCTION = () => {
+         console.log("undo sketch function")
+         var button_undo = document.querySelector('#input_image > div.image-container.svelte-p3y7hu > div.svelte-s6ybro > button:nth-child(1)');
+         // Create a new 'click' event
+         var event = new MouseEvent('click', {
+             'view': window,
+             'bubbles': true,
+             'cancelable': true
+         });
+         button_undo.dispatchEvent(event);
+     }
+
+     globalThis.DELETE_SKETCH_FUNCTION = () => {
+         console.log("delete sketch function")
+         var button_del = document.querySelector('#input_image > div.image-container.svelte-p3y7hu > div.svelte-s6ybro > button:nth-child(2)');
+         // Create a new 'click' event
+         var event = new MouseEvent('click', {
+             'view': window,
+             'bubbles': true,
+             'cancelable': true
+         });
+         button_del.dispatchEvent(event);
+     }
+
+     globalThis.togglePencil = () => {
+         var el_pencil = document.getElementById('my-toggle-pencil');
+         el_pencil.classList.toggle('clicked');
+         // simulate a click on the gradio button
+         var btn_gradio = document.querySelector("#cb-line > label > input");
+         var event = new MouseEvent('click', {
+             'view': window,
+             'bubbles': true,
+             'cancelable': true
+         });
+         btn_gradio.dispatchEvent(event);
+         if (el_pencil.classList.contains('clicked')) {
+             document.getElementById('my-toggle-eraser').classList.remove('clicked');
+             document.getElementById('my-div-pencil').style.backgroundColor = "gray";
+             document.getElementById('my-div-eraser').style.backgroundColor = "white";
+         }
+         else {
+             document.getElementById('my-toggle-eraser').classList.add('clicked');
+             document.getElementById('my-div-pencil').style.backgroundColor = "white";
+             document.getElementById('my-div-eraser').style.backgroundColor = "gray";
+         }
+     }
+
+     globalThis.toggleEraser = () => {
+         var element = document.getElementById('my-toggle-eraser');
+         element.classList.toggle('clicked');
+         // simulate a click on the gradio button
+         var btn_gradio = document.querySelector("#cb-eraser > label > input");
+         var event = new MouseEvent('click', {
+             'view': window,
+             'bubbles': true,
+             'cancelable': true
+         });
+         btn_gradio.dispatchEvent(event);
+         if (element.classList.contains('clicked')) {
+             document.getElementById('my-toggle-pencil').classList.remove('clicked');
+             document.getElementById('my-div-pencil').style.backgroundColor = "white";
+             document.getElementById('my-div-eraser').style.backgroundColor = "gray";
+         }
+         else {
+             document.getElementById('my-toggle-pencil').classList.add('clicked');
+             document.getElementById('my-div-pencil').style.backgroundColor = "gray";
+             document.getElementById('my-div-eraser').style.backgroundColor = "white";
+         }
+     }
+ }
+ """
+
+ with gr.Blocks(css="style.css") as demo:
+     gr.Markdown("# SDXS-512-DreamShaper-Sketch")
+     # these are hidden checkboxes that are used to trigger the canvas changes
+     line = gr.Checkbox(label="line", value=False, elem_id="cb-line")
+     eraser = gr.Checkbox(label="eraser", value=False, elem_id="cb-eraser")
+     with gr.Row(elem_id="main_row"):
+         with gr.Column(elem_id="column_input"):
+             gr.Markdown("## INPUT", elem_id="input_header")
+             image = gr.Image(
+                 source="canvas", tool="color-sketch", type="pil", image_mode="L",
+                 invert_colors=True, shape=(512, 512), brush_radius=8, height=440, width=440,
+                 brush_color="#000000", interactive=True, show_download_button=True, elem_id="input_image", show_label=False)
+             download_sketch = gr.Button("Download sketch", scale=1, elem_id="download_sketch")
+
+             gr.HTML("""
+             <div class="button-row">
+                 <div id="my-div-pencil" class="pad2"> <button id="my-toggle-pencil" onclick="return togglePencil(this)"></button> </div>
+                 <div id="my-div-eraser" class="pad2"> <button id="my-toggle-eraser" onclick="return toggleEraser(this)"></button> </div>
+                 <div class="pad2"> <button id="my-button-undo" onclick="return UNDO_SKETCH_FUNCTION(this)"></button> </div>
+                 <div class="pad2"> <button id="my-button-clear" onclick="return DELETE_SKETCH_FUNCTION(this)"></button> </div>
+                 <div class="pad2"> <button href="TODO" download="image" id="my-button-down" onclick='return theSketchDownloadFunction()'></button> </div>
+             </div>
+             """)
+             # gr.Markdown("## Prompt", elem_id="tools_header")
+             prompt = gr.Textbox(label="Prompt", value="", show_label=True)
+             with gr.Row():
+                 style = gr.Dropdown(label="Style", choices=STYLE_NAMES, value=DEFAULT_STYLE_NAME, scale=1)
+                 prompt_temp = gr.Textbox(label="Prompt Style Template", value=styles[DEFAULT_STYLE_NAME], scale=2, max_lines=1)
+
+             controlnet_conditioning_scale = gr.Slider(label="Control Strength", minimum=0, maximum=1, step=0.01, value=0.8)
+
+             device_choices = ['GPU','CPU']
+             device_type = gr.Radio(device_choices, label='Device',
+                                    value=device_choices[0],
+                                    interactive=True,
+                                    info='Please choose GPU if you have a GPU.')
+
+             dtype_choices = ['torch.float16','torch.float32']
+             param_dtype = gr.Radio(dtype_choices, label='torch.weight_type',
+                                    value=dtype_choices[0],
+                                    interactive=True,
+                                    info='To save GPU memory, use torch.float16. For better quality, use torch.float32.')
+
+         with gr.Column(elem_id="column_process", min_width=50, scale=0.4):
+             gr.Markdown("## SDXS-Sketch", elem_id="description")
+             run_button = gr.Button("Run", min_width=50)
+
+         with gr.Column(elem_id="column_output"):
+             gr.Markdown("## OUTPUT", elem_id="output_header")
+             result = gr.Image(label="Result", height=440, width=440, elem_id="output_image", show_label=False, show_download_button=True)
+             download_output = gr.Button("Download output", elem_id="download_output")
+             gr.Markdown("### Instructions")
+             gr.Markdown("**1**. Enter a text prompt (e.g. cat)")
+             gr.Markdown("**2**. Start sketching")
+             gr.Markdown("**3**. Change the image style using a style template")
+             gr.Markdown("**4**. Adjust the effect of sketch guidance using the slider")
+
+     # Keep the two hidden checkboxes mutually exclusive, then update the brush
+     eraser.change(fn=lambda x: gr.update(value=not x), inputs=[eraser], outputs=[line]).then(update_canvas, [line, eraser], [image])
+     line.change(fn=lambda x: gr.update(value=not x), inputs=[line], outputs=[eraser]).then(update_canvas, [line, eraser], [image])
+
+     demo.load(None, None, None, _js=scripts)
+     inputs = [image, prompt, prompt_temp, style, controlnet_conditioning_scale, device_type, param_dtype]
+     outputs = [result, download_sketch, download_output]
+     prompt.submit(fn=run, inputs=inputs, outputs=outputs)
+     style.change(lambda x: styles[x], inputs=[style], outputs=[prompt_temp]).then(
+         fn=run, inputs=inputs, outputs=outputs,)
+     run_button.click(fn=run, inputs=inputs, outputs=outputs)
+     image.change(run, inputs=inputs, outputs=outputs,)
+
+ if __name__ == "__main__":
+     demo.queue().launch(debug=True, share=True)
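Review note: `demo.py` hardcodes `"mps"` for the GPU branch, so as committed it only targets Apple Silicon. One way to choose the backend at runtime is sketched below. The helper name `resolve_device_and_dtype` and its two boolean parameters are hypothetical and not part of this commit; in the app they would presumably be fed from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`. The sketch keeps the rule from `run` that CPU inference always uses float32.

```python
def resolve_device_and_dtype(device_type, param_dtype, cuda_available, mps_available):
    """Return (device, dtype_name) for the pipeline.

    Hypothetical helper: mirrors the branching in run(), but falls back
    across backends instead of assuming "mps".
    """
    if device_type == "CPU":
        return "cpu", "torch.float32"
    if cuda_available:
        device = "cuda"
    elif mps_available:
        device = "mps"
    else:
        # No GPU backend found: fall back to CPU, which runs in float32
        return "cpu", "torch.float32"
    dtype_name = "torch.float16" if param_dtype == "torch.float16" else "torch.float32"
    return device, dtype_name


if __name__ == "__main__":
    # Example: "GPU" selected in the UI on a machine that only has MPS
    print(resolve_device_and_dtype("GPU", "torch.float16", False, True))
```

Taking the availability flags as arguments keeps the function free of a `torch` import, so the branching can be checked without a GPU.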
demo_webcam.py ADDED
@@ -0,0 +1,128 @@
+ import random
+ import numpy as np
+ from PIL import Image
+ import base64
+ from io import BytesIO
+
+ import torch
+ import torchvision.transforms.functional as F
+ from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
+ import gradio as gr
+
+ device = "mps"  # Apple Silicon (Metal) backend; use "cuda" on Linux & Windows with an NVIDIA GPU
+ weight_type = torch.float16  # saves memory; torch.float32 works as well and pictures seem to be a bit better
+
+ controlnet = ControlNetModel.from_pretrained(
+     "IDKiro/sdxs-512-dreamshaper-sketch", torch_dtype=weight_type
+ ).to(device)
+ pipe = StableDiffusionControlNetPipeline.from_pretrained(
+     "IDKiro/sdxs-512-dreamshaper", controlnet=controlnet, torch_dtype=weight_type
+ )
+ pipe.to(device)
+
+ style_list = [
+     {
+         "name": "No Style",
+         "prompt": "{prompt}",
+     },
+     {
+         "name": "Cinematic",
+         "prompt": "cinematic still {prompt} . emotional, harmonious, vignette, highly detailed, high budget, bokeh, cinemascope, moody, epic, gorgeous, film grain, grainy",
+     },
+     # Additional styles omitted for brevity
+ ]
+
+ styles = {k["name"]: k["prompt"] for k in style_list}
+ STYLE_NAMES = list(styles.keys())
+ DEFAULT_STYLE_NAME = "No Style"
+ MAX_SEED = np.iinfo(np.int32).max
+
+
+ def pil_image_to_data_url(img, format="PNG"):
+     buffered = BytesIO()
+     img.save(buffered, format=format)
+     img_str = base64.b64encode(buffered.getvalue()).decode()
+     return f"data:image/{format.lower()};base64,{img_str}"
+
+
+ def run(
+     image,
+     prompt,
+     prompt_template,
+     style_name,
+     controlnet_conditioning_scale,
+     device_type="GPU",
+     param_dtype='torch.float16',
+ ):
+     if device_type == "CPU":
+         device = "cpu"
+         param_dtype = 'torch.float32'
+     else:
+         # Was "cuda"; use the same backend the pipeline is loaded on at import time
+         device = "mps"
+
+     pipe.to(torch_device=device, torch_dtype=torch.float16 if param_dtype == 'torch.float16' else torch.float32)
+
+     print(f"prompt: {prompt}")
+     if image is None:
+         # No webcam frame yet: return a blank white image
+         return Image.new("L", (512, 512), 255)
+     prompt = prompt_template.replace("{prompt}", prompt)
+     control_image = image.convert("RGB")
+     # Invert pixel values before passing the frame to the ControlNet
+     control_image = Image.fromarray(255 - np.array(control_image))
+
+     output_pil = pipe(
+         prompt=prompt,
+         image=control_image,
+         width=512,
+         height=512,
+         guidance_scale=0.0,
+         num_inference_steps=1,
+         num_images_per_prompt=1,
+         output_type="pil",
+         controlnet_conditioning_scale=controlnet_conditioning_scale,
+     ).images[0]
+
+     # Only one output component (`result`) is wired up below, so return a single image
+     return output_pil
+
+
+ with gr.Blocks(css="style.css") as demo:
+     gr.Markdown("# SDXS-512-DreamShaper-Webcam")
+     with gr.Row():
+         with gr.Column():
+             gr.Markdown("## INPUT")
+             # Replace canvas with webcam image
+             image = gr.Image(
+                 source="webcam", type="pil", label="Webcam Image", interactive=True
+             )
+
+             prompt = gr.Textbox(label="Prompt", value="", show_label=True)
+             style = gr.Dropdown(label="Style", choices=STYLE_NAMES, value=DEFAULT_STYLE_NAME)
+             prompt_template = gr.Textbox(label="Prompt Style Template", value=styles[DEFAULT_STYLE_NAME])
+
+             controlnet_conditioning_scale = gr.Slider(label="Control Strength", minimum=0, maximum=1, step=0.01, value=0.8)
+
+             device_choices = ['GPU','CPU']
+             device_type = gr.Radio(device_choices, label='Device', value=device_choices[0], interactive=True)
+
+             dtype_choices = ['torch.float16','torch.float32']
+             param_dtype = gr.Radio(dtype_choices, label='torch.weight_type', value=dtype_choices[0], interactive=True)
+
+         with gr.Column():
+             gr.Markdown("## OUTPUT")
+             result = gr.Image(label="Result", show_label=False, show_download_button=True)
+
+     inputs = [image, prompt, prompt_template, style, controlnet_conditioning_scale, device_type, param_dtype]
+     outputs = [result]
+     prompt.submit(fn=run, inputs=inputs, outputs=outputs)
+     style.change(lambda x: styles[x], inputs=[style], outputs=[prompt_template])
+     image.change(run, inputs=inputs, outputs=outputs)
+
+ if __name__ == "__main__":
+     demo.queue().launch(debug=True)
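Review note: both demos build the final prompt with `prompt_template.replace("{prompt}", prompt)`. A minimal, standalone sketch of that step is below. The `apply_style` helper, the trimmed `STYLES` dict, and the fallback to `"No Style"` for unknown names are assumptions made for illustration; the commit itself indexes `styles[x]` directly.

```python
# Two templates copied from style_list; the rest are left out to keep the sketch short
STYLES = {
    "No Style": "{prompt}",
    "Anime": "anime artwork {prompt} . anime style, key visual, vibrant, studio anime, highly detailed",
}


def apply_style(style_name, prompt, styles=STYLES, default="No Style"):
    """Expand a style template around the user's prompt (hypothetical helper)."""
    # Unknown style names fall back to the default template (an assumption, not in the commit)
    template = styles.get(style_name, styles[default])
    # str.replace, as in run(): braces inside the user's prompt are left untouched
    return template.replace("{prompt}", prompt)


if __name__ == "__main__":
    print(apply_style("Anime", "cat"))
```

Using `str.replace` rather than `str.format` means a prompt that itself contains `{` or `}` does not raise an error.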
demo_webcam_photo.py ADDED
@@ -0,0 +1,20 @@
+ import gradio as gr
+
+ # Function to display webcam image on canvas
+ def display_webcam_image(img):
+     return img
+
+ # Gradio app interface
+ with gr.Blocks() as demo:
+     gr.Markdown("## Webcam Capture and Display")
+     # Webcam component
+     webcam = gr.Image(source="webcam", label="Webcam Capture", streaming=True)
+     # Canvas to display captured image
+     canvas = gr.Image(label="Captured Image")
+
+     # Button to capture image from webcam and display on canvas
+     capture_button = gr.Button("Capture Image")
+     capture_button.click(fn=display_webcam_image, inputs=webcam, outputs=canvas)
+
+ # Launch the app
+ demo.launch()
images/control_imgs.png ADDED

Git LFS Details

  • SHA256: 4b270acf3cf3634aecbf4835a1c56f0e31010e8a7134c0e71be26c6c02199109
  • Pointer size: 132 Bytes
  • Size of remote file: 1.81 MB
images/imgs.png ADDED

Git LFS Details

  • SHA256: af7366c2cda944124e6de8ce57c7e111ea0181597890e3e7ff23153c1f216732
  • Pointer size: 132 Bytes
  • Size of remote file: 3.5 MB
images/intro.png ADDED

Git LFS Details

  • SHA256: e4a09e4d67a4add074a14c35059df13ba306eb3fc7abafe9ee629c6945a71792
  • Pointer size: 132 Bytes
  • Size of remote file: 2.71 MB
images/method1.png ADDED
images/method2.png ADDED
images/method3.png ADDED
images/sketch.gif ADDED
images/speed.png ADDED
requirements.txt ADDED
@@ -0,0 +1,13 @@
+ einops>=0.6.1
+ numpy>=1.24.4
+ opencv-python==4.6.0.66
+ pillow>=9.5.0
+ scipy==1.11.1
+ timm>=0.9.2
+ tqdm>=4.65.0
+ diffusers==0.25.1
+ gradio==3.43.1
+ tokenizers
+ transformers
+ accelerate
+ peft
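Review note: four entries in `requirements.txt` carry no version constraint, so two installs made at different times may resolve to different versions. A small sketch for listing such entries follows. The `unpinned` helper is hypothetical, and its parsing only covers the plain `name`, `name==x` and `name>=x` style lines seen in this file, not the full requirements syntax.

```python
import re

# Contents of requirements.txt as added in this commit
REQUIREMENTS = """\
einops>=0.6.1
numpy>=1.24.4
opencv-python==4.6.0.66
pillow>=9.5.0
scipy==1.11.1
timm>=0.9.2
tqdm>=4.65.0
diffusers==0.25.1
gradio==3.43.1
tokenizers
transformers
accelerate
peft
"""


def unpinned(requirements_text):
    """Return the package names that have no version specifier (hypothetical helper)."""
    names = []
    for line in requirements_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split at the first comparison operator; a single part means no constraint
        parts = re.split(r"(==|>=|<=|~=|!=|<|>)", line, maxsplit=1)
        if len(parts) == 1:
            names.append(parts[0])
    return names


if __name__ == "__main__":
    print(unpinned(REQUIREMENTS))
```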
style.css ADDED
@@ -0,0 +1,213 @@
+ @import url('https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.15.1/css/all.min.css');
+
+ /* the outermost container of the app */
+ .main{
+     display: flex;
+     justify-content: center;
+     align-items: center;
+     width: 1200px;
+ }
+
+ /* #main_row{
+
+ } */
+
+ /* hide this class */
+ .svelte-p4aq0j {
+     display: none;
+ }
+
+ .wrap.svelte-p4aq0j.svelte-p4aq0j {
+     display: none;
+ }
+
+ #download_sketch{
+     display: none;
+ }
+
+ #download_output{
+     display: none;
+ }
+
+ #column_input, #column_output{
+     width: 500px;
+     display: flex;
+     /* justify-content: center; */
+     align-items: center;
+ }
+
+ #tools_header, #input_header, #output_header, #process_header {
+     display: flex;
+     justify-content: center;
+     align-items: center;
+     width: 400px;
+ }
+
+
+ #nn{
+     width: 100px;
+     height: 100px;
+ }
+
+
+ #column_process{
+     display: flex;
+     justify-content: center; /* Center horizontally */
+     align-items: center; /* Center vertically */
+     height: 600px;
+ }
+
+ /* this is the "SDXS-Sketch" heading above the Run button */
+ #description > span{
+     display: flex;
+     justify-content: center; /* Center horizontally */
+     align-items: center; /* Center vertically */
+ }
+
+ /* this is the "UNDO_BUTTON, X_BUTTON" */
+ div.svelte-1030q2h{
+     width: 30px;
+     height: 30px;
+     display: none;
+ }
+
+
+ #component-5 > div{
+     border: 0px;
+     box-shadow: none;
+ }
+
+ #cb-eraser, #cb-line{
+     display: none;
+ }
+
+ /* eraser text */
+ #cb-eraser > label > span{
+     display: none;
+ }
+ #cb-line > label > span{
+     display: none;
+ }
+
+
+ .button-row {
+     display: flex;
+     justify-content: center;
+     align-items: center;
+     height: 50px;
+     border: 0px;
+ }
+
+ #my-toggle-pencil{
+     background-image: url("https://icons.getbootstrap.com/assets/icons/pencil.svg");
+     background-color: white;
+     background-size: cover;
+     margin: 0px;
+     box-shadow: none;
+     width: 40px;
+     height: 40px;
+ }
+
+ #my-toggle-pencil.clicked{
+     background-image: url("https://icons.getbootstrap.com/assets/icons/pencil-fill.svg");
+     transform: scale(0.98);
+     background-color: gray;
+     background-size: cover;
+     /* background-size: 95%;
+     background-position: center; */
+     /* border: 2px solid #000; */
+     margin: 0px;
+     box-shadow: none;
+     width: 40px;
+     height: 40px;
+ }
+
+
+ #my-toggle-eraser{
+     background-image: url("https://icons.getbootstrap.com/assets/icons/eraser.svg");
+     background-color: white;
+     background-size: cover;
+     margin: 0px;
+     box-shadow: none;
+     width: 40px;
+     height: 40px;
+ }
+
+ #my-toggle-eraser.clicked{
+     background-image: url("https://icons.getbootstrap.com/assets/icons/eraser-fill.svg");
+     transform: scale(0.98);
+     background-color: gray;
+     background-size: cover;
+     margin: 0px;
+     box-shadow: none;
+     width: 40px;
+     height: 40px;
+ }
+
+
+
+ #my-button-undo{
+     background-image: url("https://icons.getbootstrap.com/assets/icons/arrow-counterclockwise.svg");
+     background-color: white;
+     background-size: cover;
+     margin: 0px;
+     box-shadow: none;
+     width: 40px;
+     height: 40px;
+ }
+
+ #my-button-clear{
+     background-image: url("https://icons.getbootstrap.com/assets/icons/x-lg.svg");
+     background-color: white;
+     background-size: cover;
+     margin: 0px;
+     box-shadow: none;
+     width: 40px;
+     height: 40px;
+ }
+
+
+ #my-button-down{
+     background-image: url("https://icons.getbootstrap.com/assets/icons/arrow-down.svg");
+     background-color: white;
+     background-size: cover;
+     margin: 0px;
+     box-shadow: none;
+     width: 40px;
+     height: 40px;
+ }
+
+ .pad2{
+     padding: 2px;
+     background-color: white;
+     border: 2px solid #000;
+     margin: 10px;
+     display: flex;
+     justify-content: center; /* Center horizontally */
+     align-items: center; /* Center vertically */
+ }
+
+
+ /* "border-width: none" is not a valid value and was ignored by browsers, so it is dropped below */
+ #output_image, #input_image{
+     border-radius: 0px;
+     border: 5px solid #000;
+ }
+
+
+ #output_image > img{
+     border: 5px solid #000;
+     border-radius: 0px;
+ }
+
+ #input_image > div.image-container.svelte-p3y7hu > div.wrap.svelte-yigbas > canvas:nth-child(1){
+     border: 5px solid #000;
+     border-radius: 0px;
+ }