Files changed (1)
  1. README.md +81 -81
README.md CHANGED
@@ -1,135 +1,135 @@
1
  ---
2
- pipeline_tag: text-to-image
3
  inference: false
4
  license: other
5
  license_name: sai-nc-community
6
- license_link: https://huggingface.co/stabilityai/sdxl-turbo/blob/main/LICENSE.md
7
  ---
8
 
9
- # SDXL-Turbo Model Card
10
 
11
- <!-- Provide a quick summary of what the model is/does. -->
12
  ![row01](output_tile.jpg)
13
- SDXL-Turbo is a fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation.
14
- A real-time demo is available here: http://clipdrop.co/stable-diffusion-turbo
15
 
16
- Please note: For commercial use, please refer to https://stability.ai/license.
17
 
18
- ## Model Details
19
 
20
- ### Model Description
21
- SDXL-Turbo is a distilled version of [SDXL 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0), trained for real-time synthesis.
22
- SDXL-Turbo is based on a novel training method called Adversarial Diffusion Distillation (ADD) (see the [technical report](https://stability.ai/research/adversarial-diffusion-distillation)), which allows sampling large-scale foundational
23
- image diffusion models in 1 to 4 steps at high image quality.
24
- This approach uses score distillation to leverage large-scale off-the-shelf image diffusion models as a teacher signal and combines this with an
25
- adversarial loss to ensure high image fidelity even in the low-step regime of one or two sampling steps.
26
 
27
- - **Developed by:** Stability AI
28
- - **Funded by:** Stability AI
29
- - **Model type:** Generative text-to-image model
30
- - **Finetuned from model:** [SDXL 1.0 Base](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)
31
 
32
- ### Model Sources
33
 
34
- For research purposes, we recommend our `generative-models` GitHub repository (https://github.com/Stability-AI/generative-models),
35
- which implements the most popular diffusion frameworks (both training and inference).
36
 
37
- - **Repository:** https://github.com/Stability-AI/generative-models
38
- - **Paper:** https://stability.ai/research/adversarial-diffusion-distillation
39
- - **Demo:** http://clipdrop.co/stable-diffusion-turbo
40
 
41
 
42
- ## Evaluation
43
- ![comparison1](image_quality_one_step.png)
44
- ![comparison2](prompt_alignment_one_step.png)
45
- The charts above evaluate user preference for SDXL-Turbo over other single- and multi-step models.
46
- SDXL-Turbo evaluated at a single step is preferred by human voters in terms of image quality and prompt following over LCM-XL evaluated at four (or fewer) steps.
47
- In addition, we see that using four steps for SDXL-Turbo further improves performance.
48
- For details on the user study, we refer to the [research paper](https://stability.ai/research/adversarial-diffusion-distillation).
49
 
50
 
51
- ## Uses
52
 
53
- ### Direct Use
54
 
55
- The model is intended for both non-commercial and commercial usage. You can use this model for non-commercial or research purposes under this [license](https://huggingface.co/stabilityai/sdxl-turbo/blob/main/LICENSE.md). Possible research areas and tasks include:
56
 
57
- - Research on generative models.
58
- - Research on real-time applications of generative models.
59
- - Research on the impact of real-time generative models.
60
- - Safe deployment of models which have the potential to generate harmful content.
61
- - Probing and understanding the limitations and biases of generative models.
62
- - Generation of artworks and use in design and other artistic processes.
63
- - Applications in educational or creative tools.
64
 
65
- For commercial use, please refer to https://stability.ai/membership.
66
 
67
- Excluded uses are described below.
68
 
69
- ### Diffusers
70
 
71
  ```
72
- pip install diffusers transformers accelerate --upgrade
73
  ```
74
 
75
- - **Text-to-image**:
76
 
77
- SDXL-Turbo does not make use of `guidance_scale` or `negative_prompt`, so we disable guidance with `guidance_scale=0.0`.
78
- The model preferably generates images of size 512x512, but larger image sizes work as well.
79
- A **single step** is enough to generate high-quality images.
80
 
81
  ```py
82
- from diffusers import AutoPipelineForText2Image
83
- import torch
84
 
85
- pipe = AutoPipelineForText2Image.from_pretrained("stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16")
86
- pipe.to("cuda")
87
 
88
- prompt = "A cinematic shot of a baby raccoon wearing an intricate Italian priest robe."
89
 
90
- image = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0.0).images[0]
91
  ```
92
 
93
- - **Image-to-image**:
94
 
95
- When using SDXL-Turbo for image-to-image generation, make sure that `num_inference_steps` * `strength` is greater than or equal
96
- to 1. The image-to-image pipeline will run for `int(num_inference_steps * strength)` steps, *e.g.* 2 * 0.5 = 1 step in our example
97
- below.
98
 
99
  ```py
100
- from diffusers import AutoPipelineForImage2Image
101
- from diffusers.utils import load_image
102
- import torch
103
 
104
- pipe = AutoPipelineForImage2Image.from_pretrained("stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16")
105
- pipe.to("cuda")
106
 
107
- init_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png").resize((512, 512))
108
 
109
- prompt = "cat wizard, gandalf, lord of the rings, detailed, fantasy, cute, adorable, Pixar, Disney, 8k"
110
 
111
- image = pipe(prompt, image=init_image, num_inference_steps=2, strength=0.5, guidance_scale=0.0).images[0]
112
  ```
113
 
114
- ### Out-of-Scope Use
115
 
116
- The model was not trained to produce factual or true representations of people or events,
117
- and therefore using the model to generate such content is out-of-scope for the abilities of this model.
118
- The model should not be used in any way that violates Stability AI's [Acceptable Use Policy](https://stability.ai/use-policy).
119
 
120
- ## Limitations and Bias
121
 
122
- ### Limitations
123
- - The generated images are of a fixed resolution (512x512 pix), and the model does not achieve perfect photorealism.
124
- - The model cannot render legible text.
125
- - Faces and people in general may not be generated properly.
126
- - The autoencoding part of the model is lossy.
127
 
128
 
129
- ### Recommendations
130
 
131
- The model is intended for both non-commercial and commercial usage.
132
 
133
- ## How to Get Started with the Model
134
 
135
- Check out https://github.com/Stability-AI/generative-models
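
The image-to-image note in the README above states that the pipeline runs for `int(num_inference_steps * strength)` steps and that this product must be at least 1. The arithmetic can be checked before calling the pipeline with a small helper like the one below. This is only a sketch: the function name `effective_img2img_steps` is hypothetical and not part of `diffusers`, and the `[0, 1]` bound on `strength` is an assumption about how the parameter is normally used.

```python
def effective_img2img_steps(num_inference_steps: int, strength: float) -> int:
    """Return int(num_inference_steps * strength), the step count the README
    gives for the image-to-image pipeline.

    Hypothetical helper for illustration; not a diffusers API.
    """
    # Assumption: strength is a fraction between 0 and 1.
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength is expected to be between 0 and 1")
    steps = int(num_inference_steps * strength)
    # The README asks for num_inference_steps * strength >= 1.
    if steps < 1:
        raise ValueError(
            "num_inference_steps * strength must be at least 1, "
            f"got {num_inference_steps} * {strength}"
        )
    return steps


# The README example: 2 steps at strength 0.5 gives a single effective step.
print(effective_img2img_steps(2, 0.5))
```

Calling the helper with `num_inference_steps=1` and `strength=0.5` raises `ValueError`, which is the misconfiguration the README warns about.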
 
1
  ---
2
+ pipeline_tag: image-to-video
3
  inference: false
4
  license: other
5
  license_name: sai-nc-community
6
+ license_link: https://huggingface.co/stabilityai/sdxl-turbo/blob/main/LICENSE.md
7
  ---
8
 
9
+ # SDXL-Turbo Model Card
10
 
11
+ <!-- Provide a quick summary of what the model is/does. -->
12
  ![row01](output_tile.jpg)
13
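+ SDXL-Turbo is a fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation.
14
+ A real-time demo is available here: http://clipdrop.co/stable-diffusion-turbo
15
 
16
+ Please note: For commercial use, please refer to https://stability.ai/license.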
17
 
18
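+ ## Model Details
19
 
20
+ ### Model Description
21
+ SDXL-Turbo is a distilled version of [SDXL 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0), trained for real-time synthesis.
22
+ SDXL-Turbo is based on a novel training method called Adversarial Diffusion Distillation (ADD) (see the [technical report](https://stability.ai/research/adversarial-diffusion-distillation)), which allows sampling large-scale foundational
23
+ image diffusion models in 1 to 4 steps at high image quality.
24
+ This approach uses score distillation to leverage large-scale off-the-shelf image diffusion models as a teacher signal and combines this with an
25
+ adversarial loss to ensure high image fidelity even in the low-step regime of one or two sampling steps.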
26
 
27
+ - **Developed by:** Stability AI
28
+ - **Funded by:** Stability AI
29
+ - **Model type:** Generative text-to-image model
30
+ - **Finetuned from model:** [SDXL 1.0 Base](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)
31
 
32
+ ### Model Sources
33
 
34
+ For research purposes, we recommend our `generative-models` GitHub repository (https://github.com/Stability-AI/generative-models),
35
+ which implements the most popular diffusion frameworks (both training and inference).
36
 
37
+ - **Repository:** https://github.com/Stability-AI/generative-models
38
+ - **Paper:** https://stability.ai/research/adversarial-diffusion-distillation
39
+ - **Demo:** http://clipdrop.co/stable-diffusion-turbo
40
 
41
 
42
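+ ## Evaluation
43
+ ![comparison1](image_quality_one_step.png)
44
+ ![comparison2](prompt_alignment_one_step.png)
45
+ The charts above evaluate user preference for SDXL-Turbo over other single- and multi-step models.
46
+ SDXL-Turbo evaluated at a single step is preferred by human voters in terms of image quality and prompt following over LCM-XL evaluated at four (or fewer) steps.
47
+ In addition, we see that using four steps for SDXL-Turbo further improves performance.
48
+ For details on the user study, we refer to the [research paper](https://stability.ai/research/adversarial-diffusion-distillation).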
49
 
50
 
51
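+ ## Uses
52
 
53
+ ### Direct Use
54
 
55
+ The model is intended for both non-commercial and commercial usage. You can use this model for non-commercial or research purposes under this [license](https://huggingface.co/stabilityai/sdxl-turbo/blob/main/LICENSE.md). Possible research areas and tasks include: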
56
 
57
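+ - Research on generative models.
58
+ - Research on real-time applications of generative models.
59
+ - Research on the impact of real-time generative models.
60
+ - Safe deployment of models which have the potential to generate harmful content.
61
+ - Probing and understanding the limitations and biases of generative models.
62
+ - Generation of artworks and use in design and other artistic processes.
63
+ - Applications in educational or creative tools.
64
 
65
+ For commercial use, please refer to https://stability.ai/membership.
66
 
67
+ Excluded uses are described below.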
68
 
69
+ ### Diffusers
70
 
71
  ```
72
+ pip install diffusers transformers accelerate --upgrade
73
  ```
74
 
75
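+ - **Text-to-image**:
76
 
77
+ SDXL-Turbo does not make use of `guidance_scale` or `negative_prompt`, so we disable guidance with `guidance_scale=0.0`.
78
+ The model preferably generates images of size 512x512, but larger image sizes work as well.
79
+ A **single step** is enough to generate high-quality images.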
80
 
81
  ```py
82
+ from diffusers import AutoPipelineForText2Image
83
+ import torch
84
 
85
+ pipe = AutoPipelineForText2Image.from_pretrained("stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16")
86
+ pipe.to("cuda")
87
 
88
+ prompt = "A cinematic shot of a baby raccoon wearing an intricate Italian priest robe."
89
 
90
+ image = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0.0).images[0]
91
  ```
92
 
93
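+ - **Image-to-image**:
94
 
95
+ When using SDXL-Turbo for image-to-image generation, make sure that `num_inference_steps` * `strength` is greater than or equal
96
+ to 1. The image-to-image pipeline will run for `int(num_inference_steps * strength)` steps, *e.g.* 2 * 0.5 = 1 step in our example
97
+ below.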
98
 
99
  ```py
100
+ from diffusers import AutoPipelineForImage2Image
101
+ from diffusers.utils import load_image
102
+ import torch
103
 
104
+ pipe = AutoPipelineForImage2Image.from_pretrained("stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16")
105
+ pipe.to("cuda")
106
 
107
+ init_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png").resize((512, 512))
108
 
109
+ prompt = "cat wizard, gandalf, lord of the rings, detailed, fantasy, cute, adorable, Pixar, Disney, 8k"
110
 
111
+ image = pipe(prompt, image=init_image, num_inference_steps=2, strength=0.5, guidance_scale=0.0).images[0]
112
  ```
113
 
114
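+ ### Out-of-Scope Use
115
 
116
+ The model was not trained to produce factual or true representations of people or events,
117
+ and therefore using the model to generate such content is out-of-scope for the abilities of this model.
118
+ The model should not be used in any way that violates Stability AI's [Acceptable Use Policy](https://stability.ai/use-policy).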
119
 
120
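+ ## Limitations and Bias
121
 
122
+ ### Limitations
123
+ - The generated images are of a fixed resolution (512x512 pix), and the model does not achieve perfect photorealism.
124
+ - The model cannot render legible text.
125
+ - Faces and people in general may not be generated properly.
126
+ - The autoencoding part of the model is lossy.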
127
 
128
 
129
+ ### Recommendations
130
 
131
+ The model is intended for both non-commercial and commercial usage.
132
 
133
+ ## How to Get Started with the Model
134
 
135
+ Check out https://github.com/Stability-AI/generative-models
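
The text-to-image notes in the README say that guidance is disabled with `guidance_scale=0.0`, that `negative_prompt` is not used, and that ADD targets sampling in 1 to 4 steps. One way to keep those settings together is a small helper that builds the call arguments. This is a sketch under stated assumptions: `turbo_text2image_kwargs` is a hypothetical name, and rejecting step counts outside 1 to 4 is an illustrative choice based on the README's description, not a limit enforced by `diffusers`.

```python
def turbo_text2image_kwargs(prompt: str, num_inference_steps: int = 1) -> dict:
    """Build keyword arguments for a text-to-image call with SDXL-Turbo.

    Hypothetical helper for illustration; not a diffusers API.
    """
    if not prompt:
        raise ValueError("prompt must be a non-empty string")
    # Assumption: stay inside the 1 to 4 step range the README describes.
    if not 1 <= num_inference_steps <= 4:
        raise ValueError("num_inference_steps is expected to be between 1 and 4")
    return {
        "prompt": prompt,
        "num_inference_steps": num_inference_steps,
        # Guidance is disabled, as in the README examples; no negative_prompt is passed.
        "guidance_scale": 0.0,
    }


kwargs = turbo_text2image_kwargs("A cinematic shot of a baby raccoon")
print(kwargs)
```

With a pipeline loaded as in the README's text-to-image example, the call would then look like `pipe(**kwargs).images[0]`.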