Upload 16 files
- Flash-README.md +79 -0
- Hyper-SD(XL).md +346 -0
- Hyper-SDXL-1step-lora.safetensors +3 -0
- Hyper-SDXL-2steps-lora.safetensors +3 -0
- Hyper-SDXL-4steps-lora.safetensors +3 -0
- PCM-README.md +9 -0
- SLAM-README.md +51 -0
- TCD-SDXL.safetensors +3 -0
- flash-sdxl.safetensors +3 -0
- pcm_sdxl_smallcfg_2step.safetensors +3 -0
- pcm_sdxl_smallcfg_4step.safetensors +3 -0
- sd_xl_turbo_lora_v1.safetensors +3 -0
- sdxl-turbo-dpo-lora.safetensors +3 -0
- sdxl_lightning_2step_lora.safetensors +3 -0
- sdxl_lightning_4step_lora.safetensors +3 -0
- slam-lora-sdxl.safetensors +3 -0
Flash-README.md
ADDED
---
tags:
- text-to-image
- stable-diffusion
- lora
- diffusers
- template:sd-lora
base_model: stabilityai/stable-diffusion-xl-base-1.0
license: cc-by-nc-nd-4.0
---
# ⚡ Flash Diffusion: FlashSDXL ⚡

Flash Diffusion is a diffusion distillation method proposed in [Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation](http://arxiv.org/abs/2406.02347) *by Clément Chadebec, Onur Tasar, Eyal Benaroche, and Benjamin Aubin.*
This model is a **108M**-parameter LoRA-distilled version of the [SDXL](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) model that is able to generate images in **4 steps**. The main purpose of this model is to reproduce the main results of the paper.
See our [live demo](https://huggingface.co/spaces/jasperai/FlashPixart) and [official implementation](https://github.com/gojasper/flash-diffusion).

<p align="center">
   <img style="width:700px;" src="images/flash_sdxl.jpg">
</p>

# How to use?

The model can be used directly with the `DiffusionPipeline` from the `diffusers` library, reducing the number of required sampling steps to **4 steps**.

```python
from diffusers import DiffusionPipeline, LCMScheduler

adapter_id = "jasperai/flash-sdxl"

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    use_safetensors=True,
)

pipe.scheduler = LCMScheduler.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    subfolder="scheduler",
    timestep_spacing="trailing",
)
pipe.to("cuda")

# Load and fuse the LoRA weights
pipe.load_lora_weights(adapter_id)
pipe.fuse_lora()

prompt = "A raccoon reading a book in a lush forest."

image = pipe(prompt, num_inference_steps=4, guidance_scale=0).images[0]
```
<p align="center">
   <img style="width:400px;" src="images/raccoon.png">
</p>

# Training Details
The model was trained for 20k iterations on 4 H100 GPUs (approximately 176 GPU hours of training in total). Please refer to the [paper](http://arxiv.org/abs/2406.02347) for further parameter details.

**Metrics on COCO 2014 validation (Table 3)**
- FID-10k: 21.62 (4 NFE)
- CLIP Score: 0.327 (4 NFE)

## Citation
If you find this work useful or use it in your research, please consider citing us:

```bibtex
@misc{chadebec2024flash,
  title={Flash Diffusion: Accelerating Any Conditional Diffusion Model for Few Steps Image Generation},
  author={Clement Chadebec and Onur Tasar and Eyal Benaroche and Benjamin Aubin},
  year={2024},
  eprint={2406.02347},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```

## License
This model is released under the Creative Commons BY-NC-ND 4.0 license.
Hyper-SD(XL).md
ADDED
---
license: openrail++
library_name: diffusers
inference: false
tags:
- lora
- text-to-image
- stable-diffusion
---

# Hyper-SD
Official repository of the paper: *[Hyper-SD](https://arxiv.org/abs/2404.13686)*.

Project Page: https://hyper-sd.github.io/

![](https://huggingface.co/ByteDance/Hyper-SD/resolve/main/hypersd_tearser.jpg)

## News🔥🔥🔥

* May.13, 2024. 💥💥💥 The **12-Steps CFG-Preserved** [Hyper-SDXL-12steps-CFG-LoRA](https://huggingface.co/ByteDance/Hyper-SD/blob/main/Hyper-SDXL-12steps-CFG-lora.safetensors) and [Hyper-SD15-12steps-CFG-LoRA](https://huggingface.co/ByteDance/Hyper-SD/blob/main/Hyper-SD15-12steps-CFG-lora.safetensors) are also available now (supporting 5~8 guidance scales); these could be more practical, with a better trade-off between performance and speed. Enjoy! 💥💥💥
* Apr.30, 2024. Our **8-Steps CFG-Preserved** [Hyper-SDXL-8steps-CFG-LoRA](https://huggingface.co/ByteDance/Hyper-SD/blob/main/Hyper-SDXL-8steps-CFG-lora.safetensors) and [Hyper-SD15-8steps-CFG-LoRA](https://huggingface.co/ByteDance/Hyper-SD/blob/main/Hyper-SD15-8steps-CFG-lora.safetensors) are available now (supporting 5~8 guidance scales); we strongly recommend making the 8-step CFG LoRA a standard configuration for all SDXL and SD15 models!!!
* Apr.28, 2024. ComfyUI workflows for the 1-Step Unified LoRA 🥰, using the TCDScheduler to run inference at different step counts, are [released](https://huggingface.co/ByteDance/Hyper-SD/tree/main/comfyui)! Remember to install ⭕️ [ComfyUI-TCD](https://github.com/JettHu/ComfyUI-TCD) in your `ComfyUI/custom_nodes` folder!!! You're encouraged to adjust the eta parameter to get better results 🌟!
* Apr.26, 2024. Thanks to @[Pete](https://huggingface.co/pngwn) for contributing a larger canvas to our [scribble demo](https://huggingface.co/spaces/ByteDance/Hyper-SD15-Scribble) 👏.
* Apr.24, 2024. The ComfyUI [workflow](https://huggingface.co/ByteDance/Hyper-SD/blob/main/comfyui/Hyper-SDXL-1step-Unet-workflow.json) and [checkpoint](https://huggingface.co/ByteDance/Hyper-SD/blob/main/Hyper-SDXL-1step-Unet-Comfyui.fp16.safetensors) for the 1-Step SDXL UNet ✨ are also available! Don't forget ⭕️ to install the custom [scheduler](https://huggingface.co/ByteDance/Hyper-SD/tree/main/comfyui/ComfyUI-HyperSDXL1StepUnetScheduler) in your `ComfyUI/custom_nodes` folder!!!
* Apr.23, 2024. ComfyUI workflows for the N-Steps LoRAs are [released](https://huggingface.co/ByteDance/Hyper-SD/tree/main/comfyui)! Worth a try for creators 💥!
* Apr.23, 2024. Our technical report 📚 is uploaded to [arXiv](https://arxiv.org/abs/2404.13686)! Many implementation details are provided and we welcome more discussion 👏.
* Apr.21, 2024. Hyper-SD ⚡️ is highly compatible and works well with different base models and ControlNets. To clarify, we also append a ControlNet usage example [here](https://huggingface.co/ByteDance/Hyper-SD#controlnet-usage).
* Apr.20, 2024. Our checkpoints and two demos 🤗 (i.e. [SD15-Scribble](https://huggingface.co/spaces/ByteDance/Hyper-SD15-Scribble) and [SDXL-T2I](https://huggingface.co/spaces/ByteDance/Hyper-SDXL-1Step-T2I)) are publicly available on the [Hugging Face repo](https://huggingface.co/ByteDance/Hyper-SD).

## Try our Hugging Face demos:
The Hyper-SD scribble demo is hosted on [🤗 scribble](https://huggingface.co/spaces/ByteDance/Hyper-SD15-Scribble)

The Hyper-SDXL one-step text-to-image demo is hosted on [🤗 T2I](https://huggingface.co/spaces/ByteDance/Hyper-SDXL-1Step-T2I)

## Introduction

Hyper-SD is one of the new state-of-the-art diffusion model acceleration techniques.
In this repository, we release the models distilled from [SDXL Base 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) and [Stable-Diffusion v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5).

## Checkpoints

* `Hyper-SDXL-Nstep-lora.safetensors`: LoRA checkpoint, for SDXL-related models.
* `Hyper-SD15-Nstep-lora.safetensors`: LoRA checkpoint, for SD1.5-related models.
* `Hyper-SDXL-1step-unet.safetensors`: UNet checkpoint distilled from SDXL-Base.

## Text-to-Image Usage
### SDXL-related models
#### 2-Steps, 4-Steps, 8-Steps LoRA
Take the 2-step LoRA as an example; you can also use the other LoRAs with the corresponding inference-step settings.
```python
import torch
from diffusers import DiffusionPipeline, DDIMScheduler
from huggingface_hub import hf_hub_download

base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
repo_name = "ByteDance/Hyper-SD"
# Take the 2-step LoRA as an example
ckpt_name = "Hyper-SDXL-2steps-lora.safetensors"
# Load the model.
pipe = DiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to("cuda")
pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name))
pipe.fuse_lora()
# Ensure the DDIM scheduler's timestep spacing is set to "trailing"!
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")
prompt = "a photo of a cat"
image = pipe(prompt=prompt, num_inference_steps=2, guidance_scale=0).images[0]
```

#### Unified LoRA (supports 1 to 8 inference steps)
You can flexibly adjust the number of inference steps and the eta value to achieve the best performance.
```python
import torch
from diffusers import DiffusionPipeline, TCDScheduler
from huggingface_hub import hf_hub_download

base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
repo_name = "ByteDance/Hyper-SD"
ckpt_name = "Hyper-SDXL-1step-lora.safetensors"
# Load the model.
pipe = DiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to("cuda")
pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name))
pipe.fuse_lora()
# Use the TCD scheduler to achieve better image quality
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
# Lower eta results in more detail for multi-step inference
eta = 1.0
prompt = "a photo of a cat"
image = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0, eta=eta).images[0]
```
+
|
91 |
+
#### 1-step SDXL Unet
|
92 |
+
Only for the single step inference.
|
93 |
+
```python
|
94 |
+
import torch
|
95 |
+
from diffusers import DiffusionPipeline, UNet2DConditionModel, LCMScheduler
|
96 |
+
from huggingface_hub import hf_hub_download
|
97 |
+
from safetensors.torch import load_file
|
98 |
+
base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
|
99 |
+
repo_name = "ByteDance/Hyper-SD"
|
100 |
+
ckpt_name = "Hyper-SDXL-1step-Unet.safetensors"
|
101 |
+
# Load model.
|
102 |
+
unet = UNet2DConditionModel.from_config(base_model_id, subfolder="unet").to("cuda", torch.float16)
|
103 |
+
unet.load_state_dict(load_file(hf_hub_download(repo_name, ckpt_name), device="cuda"))
|
104 |
+
pipe = DiffusionPipeline.from_pretrained(base_model_id, unet=unet, torch_dtype=torch.float16, variant="fp16").to("cuda")
|
105 |
+
# Use LCM scheduler instead of ddim scheduler to support specific timestep number inputs
|
106 |
+
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
|
107 |
+
# Set start timesteps to 800 in the one-step inference to get better results
|
108 |
+
prompt="a photo of a cat"
|
109 |
+
image=pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0, timesteps=[800]).images[0]
|
110 |
+
```
|
111 |
+
|
112 |
+
|
113 |
+
### SD1.5-related models
|
114 |
+
|
115 |
+
#### 2-Steps, 4-Steps, 8-steps LoRA
|
116 |
+
Take the 2-steps LoRA as an example, you can also use other LoRAs for the corresponding inference steps setting.
|
117 |
+
```python
|
118 |
+
import torch
|
119 |
+
from diffusers import DiffusionPipeline, DDIMScheduler
|
120 |
+
from huggingface_hub import hf_hub_download
|
121 |
+
base_model_id = "runwayml/stable-diffusion-v1-5"
|
122 |
+
repo_name = "ByteDance/Hyper-SD"
|
123 |
+
# Take 2-steps lora as an example
|
124 |
+
ckpt_name = "Hyper-SD15-2steps-lora.safetensors"
|
125 |
+
# Load model.
|
126 |
+
pipe = DiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to("cuda")
|
127 |
+
pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name))
|
128 |
+
pipe.fuse_lora()
|
129 |
+
# Ensure ddim scheduler timestep spacing set as trailing !!!
|
130 |
+
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")
|
131 |
+
prompt="a photo of a cat"
|
132 |
+
image=pipe(prompt=prompt, num_inference_steps=2, guidance_scale=0).images[0]
|
133 |
+
```
|
134 |
+
|
135 |
+
|
136 |
+
#### Unified LoRA (support 1 to 8 steps inference)
|
137 |
+
You can flexibly adjust the number of inference steps and eta value to achieve best performance.
|
138 |
+
```python
|
139 |
+
import torch
|
140 |
+
from diffusers import DiffusionPipeline, TCDScheduler
|
141 |
+
from huggingface_hub import hf_hub_download
|
142 |
+
base_model_id = "runwayml/stable-diffusion-v1-5"
|
143 |
+
repo_name = "ByteDance/Hyper-SD"
|
144 |
+
ckpt_name = "Hyper-SD15-1step-lora.safetensors"
|
145 |
+
# Load model.
|
146 |
+
pipe = DiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to("cuda")
|
147 |
+
pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name))
|
148 |
+
pipe.fuse_lora()
|
149 |
+
# Use TCD scheduler to achieve better image quality
|
150 |
+
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
|
151 |
+
# Lower eta results in more detail for multi-steps inference
|
152 |
+
eta=1.0
|
153 |
+
prompt="a photo of a cat"
|
154 |
+
image=pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0, eta=eta).images[0]
|
155 |
+
```
|
156 |
+
|
157 |
+
## ControlNet Usage
|
158 |
+
### SDXL-related models
|
159 |
+
|
160 |
+
#### 2-Steps, 4-Steps, 8-steps LoRA
|
161 |
+
Take Canny Controlnet and 2-steps inference as an example:
|
162 |
+
```python
|
163 |
+
import torch
|
164 |
+
from diffusers.utils import load_image
|
165 |
+
import numpy as np
|
166 |
+
import cv2
|
167 |
+
from PIL import Image
|
168 |
+
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL, DDIMScheduler
|
169 |
+
from huggingface_hub import hf_hub_download
|
170 |
+
|
171 |
+
# Load original image
|
172 |
+
image = load_image("https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png")
|
173 |
+
image = np.array(image)
|
174 |
+
# Prepare Canny Control Image
|
175 |
+
low_threshold = 100
|
176 |
+
high_threshold = 200
|
177 |
+
image = cv2.Canny(image, low_threshold, high_threshold)
|
178 |
+
image = image[:, :, None]
|
179 |
+
image = np.concatenate([image, image, image], axis=2)
|
180 |
+
control_image = Image.fromarray(image)
|
181 |
+
control_image.save("control.png")
|
182 |
+
control_weight = 0.5 # recommended for good generalization
|
183 |
+
|
184 |
+
# Initialize pipeline
|
185 |
+
controlnet = ControlNetModel.from_pretrained(
|
186 |
+
"diffusers/controlnet-canny-sdxl-1.0",
|
187 |
+
torch_dtype=torch.float16
|
188 |
+
)
|
189 |
+
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
|
190 |
+
pipe = StableDiffusionXLControlNetPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", controlnet=controlnet, vae=vae, torch_dtype=torch.float16).to("cuda")
|
191 |
+
|
192 |
+
pipe.load_lora_weights(hf_hub_download("ByteDance/Hyper-SD", "Hyper-SDXL-2steps-lora.safetensors"))
|
193 |
+
# Ensure ddim scheduler timestep spacing set as trailing !!!
|
194 |
+
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")
|
195 |
+
pipe.fuse_lora()
|
196 |
+
image = pipe("A chocolate cookie", num_inference_steps=2, image=control_image, guidance_scale=0, controlnet_conditioning_scale=control_weight).images[0]
|
197 |
+
image.save('image_out.png')
|
198 |
+
```
|
199 |
+
|
200 |
+
#### Unified LoRA (support 1 to 8 steps inference)
|
201 |
+
Take Canny Controlnet as an example:
|
202 |
+
```python
|
203 |
+
import torch
|
204 |
+
from diffusers.utils import load_image
|
205 |
+
import numpy as np
|
206 |
+
import cv2
|
207 |
+
from PIL import Image
|
208 |
+
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL, TCDScheduler
|
209 |
+
from huggingface_hub import hf_hub_download
|
210 |
+
|
211 |
+
# Load original image
|
212 |
+
image = load_image("https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png")
|
213 |
+
image = np.array(image)
|
214 |
+
# Prepare Canny Control Image
|
215 |
+
low_threshold = 100
|
216 |
+
high_threshold = 200
|
217 |
+
image = cv2.Canny(image, low_threshold, high_threshold)
|
218 |
+
image = image[:, :, None]
|
219 |
+
image = np.concatenate([image, image, image], axis=2)
|
220 |
+
control_image = Image.fromarray(image)
|
221 |
+
control_image.save("control.png")
|
222 |
+
control_weight = 0.5 # recommended for good generalization
|
223 |
+
|
224 |
+
# Initialize pipeline
|
225 |
+
controlnet = ControlNetModel.from_pretrained(
|
226 |
+
"diffusers/controlnet-canny-sdxl-1.0",
|
227 |
+
torch_dtype=torch.float16
|
228 |
+
)
|
229 |
+
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
|
230 |
+
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
|
231 |
+
"stabilityai/stable-diffusion-xl-base-1.0",
|
232 |
+
controlnet=controlnet, vae=vae, torch_dtype=torch.float16).to("cuda")
|
233 |
+
|
234 |
+
# Load Hyper-SD15-1step lora
|
235 |
+
pipe.load_lora_weights(hf_hub_download("ByteDance/Hyper-SD", "Hyper-SDXL-1step-lora.safetensors"))
|
236 |
+
pipe.fuse_lora()
|
237 |
+
# Use TCD scheduler to achieve better image quality
|
238 |
+
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
|
239 |
+
# Lower eta results in more detail for multi-steps inference
|
240 |
+
eta=1.0
|
241 |
+
image = pipe("A chocolate cookie", num_inference_steps=4, image=control_image, guidance_scale=0, controlnet_conditioning_scale=control_weight, eta=eta).images[0]
|
242 |
+
image.save('image_out.png')
|
243 |
+
```
|
244 |
+
|
245 |
+
### SD1.5-related models
|
246 |
+
|
247 |
+
#### 2-Steps, 4-Steps, 8-steps LoRA
|
248 |
+
Take Canny Controlnet and 2-steps inference as an example:
|
249 |
+
```python
|
250 |
+
import torch
|
251 |
+
from diffusers.utils import load_image
|
252 |
+
import numpy as np
|
253 |
+
import cv2
|
254 |
+
from PIL import Image
|
255 |
+
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, DDIMScheduler
|
256 |
+
|
257 |
+
from huggingface_hub import hf_hub_download
|
258 |
+
|
259 |
+
controlnet_checkpoint = "lllyasviel/control_v11p_sd15_canny"
|
260 |
+
|
261 |
+
# Load original image
|
262 |
+
image = load_image("https://huggingface.co/lllyasviel/control_v11p_sd15_canny/resolve/main/images/input.png")
|
263 |
+
image = np.array(image)
|
264 |
+
# Prepare Canny Control Image
|
265 |
+
low_threshold = 100
|
266 |
+
high_threshold = 200
|
267 |
+
image = cv2.Canny(image, low_threshold, high_threshold)
|
268 |
+
image = image[:, :, None]
|
269 |
+
image = np.concatenate([image, image, image], axis=2)
|
270 |
+
control_image = Image.fromarray(image)
|
271 |
+
control_image.save("control.png")
|
272 |
+
|
273 |
+
# Initialize pipeline
|
274 |
+
controlnet = ControlNetModel.from_pretrained(controlnet_checkpoint, torch_dtype=torch.float16)
|
275 |
+
pipe = StableDiffusionControlNetPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16).to("cuda")
|
276 |
+
pipe.load_lora_weights(hf_hub_download("ByteDance/Hyper-SD", "Hyper-SD15-2steps-lora.safetensors"))
|
277 |
+
pipe.fuse_lora()
|
278 |
+
# Ensure ddim scheduler timestep spacing set as trailing !!!
|
279 |
+
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")
|
280 |
+
image = pipe("a blue paradise bird in the jungle", num_inference_steps=2, image=control_image, guidance_scale=0).images[0]
|
281 |
+
image.save('image_out.png')
|
282 |
+
```
|
283 |
+
|
284 |
+
|
285 |
+
#### Unified LoRA (support 1 to 8 steps inference)
|
286 |
+
Take Canny Controlnet as an example:
|
287 |
+
```python
|
288 |
+
import torch
|
289 |
+
from diffusers.utils import load_image
|
290 |
+
import numpy as np
|
291 |
+
import cv2
|
292 |
+
from PIL import Image
|
293 |
+
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, TCDScheduler
|
294 |
+
from huggingface_hub import hf_hub_download
|
295 |
+
|
296 |
+
controlnet_checkpoint = "lllyasviel/control_v11p_sd15_canny"
|
297 |
+
|
298 |
+
# Load original image
|
299 |
+
image = load_image("https://huggingface.co/lllyasviel/control_v11p_sd15_canny/resolve/main/images/input.png")
|
300 |
+
image = np.array(image)
|
301 |
+
# Prepare Canny Control Image
|
302 |
+
low_threshold = 100
|
303 |
+
high_threshold = 200
|
304 |
+
image = cv2.Canny(image, low_threshold, high_threshold)
|
305 |
+
image = image[:, :, None]
|
306 |
+
image = np.concatenate([image, image, image], axis=2)
|
307 |
+
control_image = Image.fromarray(image)
|
308 |
+
control_image.save("control.png")
|
309 |
+
|
310 |
+
# Initialize pipeline
|
311 |
+
controlnet = ControlNetModel.from_pretrained(controlnet_checkpoint, torch_dtype=torch.float16)
|
312 |
+
pipe = StableDiffusionControlNetPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16).to("cuda")
|
313 |
+
# Load Hyper-SD15-1step lora
|
314 |
+
pipe.load_lora_weights(hf_hub_download("ByteDance/Hyper-SD", "Hyper-SD15-1step-lora.safetensors"))
|
315 |
+
pipe.fuse_lora()
|
316 |
+
# Use TCD scheduler to achieve better image quality
|
317 |
+
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
|
318 |
+
# Lower eta results in more detail for multi-steps inference
|
319 |
+
eta=1.0
|
320 |
+
image = pipe("a blue paradise bird in the jungle", num_inference_steps=1, image=control_image, guidance_scale=0, eta=eta).images[0]
|
321 |
+
image.save('image_out.png')
|
322 |
+
```
|
323 |
+
## Comfyui Usage
|
324 |
+
* `Hyper-SDXL-Nsteps-lora.safetensors`: [text-to-image workflow](https://huggingface.co/ByteDance/Hyper-SD/blob/main/comfyui/Hyper-SDXL-Nsteps-lora-workflow.json)
|
325 |
+
* `Hyper-SD15-Nsteps-lora.safetensors`: [text-to-image workflow](https://huggingface.co/ByteDance/Hyper-SD/blob/main/comfyui/Hyper-SD15-Nsteps-lora-workflow.json)
|
326 |
+
* `Hyper-SDXL-1step-Unet-Comfyui.fp16.safetensors`: [text-to-image workflow](https://huggingface.co/ByteDance/Hyper-SD/blob/main/comfyui/Hyper-SDXL-1step-Unet-workflow.json)
|
327 |
+
* **REQUIREMENT / INSTALL** for 1-Step SDXL UNet: Please install our [scheduler folder](https://huggingface.co/ByteDance/Hyper-SD/tree/main/comfyui/ComfyUI-HyperSDXL1StepUnetScheduler) into your `ComfyUI/custom_nodes` to enable sampling from 800 timestep instead of 999.
|
328 |
+
* i.e. making sure the `ComfyUI/custom_nodes/ComfyUI-HyperSDXL1StepUnetScheduler` folder exist.
|
329 |
+
* For more details, please refer to our [technical report](https://arxiv.org/abs/2404.13686).
|
330 |
+
* `Hyper-SD15-1step-lora.safetensors`: [text-to-image workflow](https://huggingface.co/ByteDance/Hyper-SD/blob/main/comfyui/Hyper-SD15-1step-unified-lora-workflow.json)
|
331 |
+
* `Hyper-SDXL-1step-lora.safetensors`: [text-to-image workflow](https://huggingface.co/ByteDance/Hyper-SD/blob/main/comfyui/Hyper-SDXL-1step-unified-lora-workflow.json)
|
332 |
+
* **REQUIREMENT / INSTALL** for 1-Step Unified LoRAs: Please install the [ComfyUI-TCD](https://github.com/JettHu/ComfyUI-TCD) into your `ComfyUI/custom_nodes` to enable TCDScheduler with support of different inference steps (1~8) using single checkpoint.
|
333 |
+
* i.e. making sure the `ComfyUI/custom_nodes/ComfyUI-TCD` folder exist.
|
334 |
+
* You're encouraged to adjust the eta parameter in TCDScheduler to get better results.
|
335 |
+
|
336 |
+
## Citation
|
337 |
+
```bibtex
|
338 |
+
@misc{ren2024hypersd,
|
339 |
+
title={Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis},
|
340 |
+
author={Yuxi Ren and Xin Xia and Yanzuo Lu and Jiacheng Zhang and Jie Wu and Pan Xie and Xing Wang and Xuefeng Xiao},
|
341 |
+
year={2024},
|
342 |
+
eprint={2404.13686},
|
343 |
+
archivePrefix={arXiv},
|
344 |
+
primaryClass={cs.CV}
|
345 |
+
}
|
346 |
+
```
|
Hyper-SDXL-1step-lora.safetensors
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:5a75acf70ca40a9c8ab0a2dd1bf76174c64f9636e98809fe87a223616e3cc4d9
size 393854592
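
Each `.safetensors` entry in this upload is stored with Git LFS, so the committed file is a short pointer: `oid` is the SHA-256 digest of the actual weight blob and `size` is its byte count. A minimal sketch for checking a downloaded file against its pointer (the local path is illustrative):
```python
import hashlib
from pathlib import Path

# Values copied from the pointer file above.
EXPECTED_OID = "5a75acf70ca40a9c8ab0a2dd1bf76174c64f9636e98809fe87a223616e3cc4d9"
EXPECTED_SIZE = 393854592

path = Path("Hyper-SDXL-1step-lora.safetensors")  # illustrative local path

# Cheap check first: the byte count must match the pointer's size field.
assert path.stat().st_size == EXPECTED_SIZE, "size mismatch"

# Then hash the file in chunks and compare against the pointer's oid.
digest = hashlib.sha256()
with path.open("rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        digest.update(chunk)
assert digest.hexdigest() == EXPECTED_OID, "oid mismatch"
print("pointer verified")
```
The same check applies to every pointer file listed below.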
Hyper-SDXL-2steps-lora.safetensors
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:0ef7e0cc5aa10d89eb7cbb77ec15b83e6ae1a596f75a31bccb227cc6043702e1
size 393854592
Hyper-SDXL-4steps-lora.safetensors
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:06c37b7cd5f5c5c2aa21deeb69c7b49b6124c4402b8dbd9e78fbf36ca3243b82
size 393854592
PCM-README.md
ADDED
---
library_name: diffusers
pipeline_tag: text-to-image
---
# Phased Consistency Model

LoRA weights of Stable Diffusion v1-5 for fast text-to-image generation.

[[paper](https://huggingface.co/papers/2405.18407)] [[arXiv](https://arxiv.org/abs/2405.18407)] [[code](https://github.com/G-U-N/Phased-Consistency-Model)] [[project page](https://g-u-n.github.io/projects/pcm)]
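
The card above ships no usage snippet. A minimal sketch in the style of the other model cards in this upload, loading the `pcm_sdxl_smallcfg_2step.safetensors` file included here into the SDXL base pipeline; the scheduler choice and guidance scale are assumptions (the `smallcfg` naming suggests a low guidance scale), not the authors' documented settings — see the linked code repository for the reference sampler.
```python
import torch
from diffusers import DiffusionPipeline, TCDScheduler

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")
# Load the 2-step PCM LoRA shipped in this upload (local path, illustrative).
pipe.load_lora_weights("pcm_sdxl_smallcfg_2step.safetensors")
pipe.fuse_lora()
# Assumption: a TCD-style scheduler, as used for the other few-step LoRAs here.
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)

prompt = "a photo of a cat"
# "smallcfg" suggests a low guidance scale; 1.0 is an assumption.
image = pipe(prompt, num_inference_steps=2, guidance_scale=1.0).images[0]
```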
SLAM-README.md
ADDED
---
library_name: diffusers
base_model: stabilityai/stable-diffusion-xl-base-1.0
tags:
- text-to-image
license: apache-2.0
inference: false
---
# Sub-path Linear Approximation Model (SLAM) LoRA: SDXL
Paper: [https://arxiv.org/abs/2404.13903](https://arxiv.org/abs/2404.13903)<br/>
Project Page: [https://subpath-linear-approx-model.github.io/](https://subpath-linear-approx-model.github.io/)<br/>
The checkpoint is distilled from [stabilityai/stable-diffusion-xl-base-1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) with our proposed Sub-path Linear Approximation Model, which reduces the number of inference steps to only 2-4.
## Usage
First, install the latest version of the `diffusers` library as well as `peft`, `accelerate`, and `transformers`.
```bash
pip install --upgrade pip
pip install --upgrade diffusers transformers accelerate peft
```
We implement SLAM to be compatible with the [LCMScheduler](https://huggingface.co/docs/diffusers/v0.22.3/en/api/schedulers/lcm#diffusers.LCMScheduler). You can use SLAM-LoRA just like you use LCM-LoRA.
```python
import torch
from diffusers import LCMScheduler, AutoPipelineForText2Image

model_id = "stabilityai/stable-diffusion-xl-base-1.0"
adapter_id = "alimama-creative/slam-lora-sdxl"

pipe = AutoPipelineForText2Image.from_pretrained(model_id, torch_dtype=torch.float16, variant="fp16")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.to("cuda")

# Load and fuse the SLAM LoRA
pipe.load_lora_weights(adapter_id)
pipe.fuse_lora()

prompt = "A brown teddy bear holding a glass vase in front of a grave."

image = pipe(prompt=prompt, num_inference_steps=4, guidance_scale=1.0).images[0]
```

Comparison with latent-consistency/lcm-lora-sdxl:
<img src='https://huggingface.co/alimama-creative/slam-lora-sdxl/resolve/main/sdxl_cmp.jpg'>

---

More examples:
<img src='https://huggingface.co/alimama-creative/slam-lora-sdxl/resolve/main/slam-lora-sdxl.jpg'>
TCD-SDXL.safetensors
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:2c777bc60abf41d3eb0fe405d23d73c280a020eea5adf97a82a141592c33feba
size 393854624
flash-sdxl.safetensors
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:afe2ca6e27c4c6087f50ef42772c45d7b0efbc471b76e422492403f9cae724d7
size 371758976
pcm_sdxl_smallcfg_2step.safetensors
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:242cbe4695fe3f2e248faa71cf53f2ccbf248a316973e4b2f38ab9e34f35a5ab
size 393854624
pcm_sdxl_smallcfg_4step.safetensors
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:d0bf40a7f280829195563486bec7253f043a06b1f218602b20901c367641023e
size 393854624
sd_xl_turbo_lora_v1.safetensors
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:a599c42a9f4f7494c7f410dbc0fd432cf0242720509e9d52fa41aac7a88d1b69
size 787342192
sdxl-turbo-dpo-lora.safetensors
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:c31b841d4574f724fa9f2600b69fd30c81d2593d2e11a3b6bdf51ec8b8761e6d
size 46615272
sdxl_lightning_2step_lora.safetensors
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:04fafc778385b24144498a8247498ae4fb7a69f702ea0566bdce2845a31fcc43
size 393854592
sdxl_lightning_4step_lora.safetensors
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:bf56cf2657efb15e465d81402ed481d1e11c4677e4bcce1bc11fe71ad8506b79
size 393854592
slam-lora-sdxl.safetensors
ADDED
version https://git-lfs.github.com/spec/v1
oid sha256:22569a946b0db645aa3b8eb782c674c8e726a7cc0d655887c21fecf6dfe6ad91
size 393854592