openvino-model

#19
by echarlaix - opened
README.md CHANGED
@@ -48,7 +48,7 @@ The SDXL base model performs significantly better than the previous variants, an
 
 ### 🧨 Diffusers
 
- Make sure to upgrade diffusers to >= 0.18.0:
+ Make sure to upgrade diffusers to >= 0.19.0:
 ```
 pip install diffusers --upgrade
 ```
@@ -58,7 +58,8 @@ In addition make sure to install `transformers`, `safetensors`, `accelerate` as
 pip install invisible_watermark transformers accelerate safetensors
 ```
 
- You can use the model then as follows
+ To just use the base model, you can run:
+
 ```py
 from diffusers import DiffusionPipeline
 import torch
@@ -74,6 +75,48 @@ prompt = "An astronaut riding a green horse"
 images = pipe(prompt=prompt).images[0]
 ```
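The diff context elides the middle of this snippet. For reference, a complete version, assuming the same fp16 loading arguments that appear in the ensemble example below, would look like:

```py
from diffusers import DiffusionPipeline
import torch

# load the base model in fp16 (assumed to match the loading arguments shown below)
pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
)
pipe.to("cuda")

prompt = "An astronaut riding a green horse"

images = pipe(prompt=prompt).images[0]
```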
 
+ To use the whole base + refiner pipeline as an ensemble of experts you can run:
+
+ ```py
+ from diffusers import DiffusionPipeline
+ import torch
+
+ # load both base & refiner
+ base = DiffusionPipeline.from_pretrained(
+     "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
+ )
+ base.to("cuda")
+ refiner = DiffusionPipeline.from_pretrained(
+     "stabilityai/stable-diffusion-xl-refiner-1.0",
+     text_encoder_2=base.text_encoder_2,
+     vae=base.vae,
+     torch_dtype=torch.float16,
+     use_safetensors=True,
+     variant="fp16",
+ )
+ refiner.to("cuda")
+
+ # Define how many steps and what fraction of steps to run on each expert (80/20 split here)
+ n_steps = 40
+ high_noise_frac = 0.8
+
+ prompt = "A majestic lion jumping from a big stone at night"
+
+ # run both experts
+ image = base(
+     prompt=prompt,
+     num_inference_steps=n_steps,
+     denoising_end=high_noise_frac,
+     output_type="latent",
+ ).images
+ image = refiner(
+     prompt=prompt,
+     num_inference_steps=n_steps,
+     denoising_start=high_noise_frac,
+     image=image,
+ ).images[0]
+ ```
+
 When using `torch >= 2.0`, you can improve the inference speed by 20-30% with `torch.compile`. Simply wrap the unet with `torch.compile` before running the pipeline:
 ```py
 pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
 
@@ -87,6 +130,58 @@ instead of `.to("cuda")`:
 + pipe.enable_model_cpu_offload()
 ```
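Spelled out as a complete script, the memory-saving variant this hunk describes is a minimal sketch along these lines (assuming the same base checkpoint as above; `enable_model_cpu_offload` requires `accelerate`):

```py
from diffusers import DiffusionPipeline
import torch

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
)
# instead of pipe.to("cuda"): keep submodules on the CPU and move each one to
# the GPU only while it runs, trading some speed for a smaller VRAM footprint
pipe.enable_model_cpu_offload()

image = pipe(prompt="An astronaut riding a green horse").images[0]
```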
 
+ For more information on how to use Stable Diffusion XL with `diffusers`, please have a look at [the Stable Diffusion XL Docs](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/stable_diffusion_xl).
+
+ ### Optimum
+ [Optimum](https://github.com/huggingface/optimum) provides a Stable Diffusion pipeline compatible with both [OpenVINO](https://docs.openvino.ai/latest/index.html) and [ONNX Runtime](https://onnxruntime.ai/).
+
+ #### OpenVINO
+
+ To install Optimum with the dependencies required for OpenVINO:
+
+ ```bash
+ pip install optimum[openvino]
+ ```
+
+ To load an OpenVINO model and run inference with OpenVINO Runtime, replace `StableDiffusionXLPipeline` with Optimum's `OVStableDiffusionXLPipeline`. If you want to load a PyTorch model and convert it to the OpenVINO format on the fly, you can set `export=True`.
+
+ ```diff
+ - from diffusers import StableDiffusionXLPipeline
+ + from optimum.intel import OVStableDiffusionXLPipeline
+
+ model_id = "stabilityai/stable-diffusion-xl-base-1.0"
+ - pipeline = StableDiffusionXLPipeline.from_pretrained(model_id)
+ + pipeline = OVStableDiffusionXLPipeline.from_pretrained(model_id)
+ prompt = "A majestic lion jumping from a big stone at night"
+ image = pipeline(prompt).images[0]
+ ```
+
+ You can find more examples (such as static reshaping and model compilation) in the optimum [documentation](https://huggingface.co/docs/optimum/main/en/intel/inference#stable-diffusion-xl).
+
+
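To give an idea of what static reshaping and model compilation involve, here is a minimal sketch; it assumes the `reshape()` and `compile()` methods described in the optimum-intel documentation linked above:

```py
from optimum.intel import OVStableDiffusionXLPipeline

model_id = "stabilityai/stable-diffusion-xl-base-1.0"
# export=True converts the PyTorch weights to the OpenVINO format on the fly
pipeline = OVStableDiffusionXLPipeline.from_pretrained(model_id, export=True)

# fix the input shapes ahead of time so OpenVINO can optimize for them;
# inference is then restricted to these exact dimensions
pipeline.reshape(batch_size=1, height=1024, width=1024, num_images_per_prompt=1)
pipeline.compile()

image = pipeline("A majestic lion jumping from a big stone at night", height=1024, width=1024).images[0]
```

Statically reshaped models avoid recompiling for changing input shapes, which can lower latency on Intel hardware.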
+ #### ONNX
+
+ To install Optimum with the dependencies required for ONNX Runtime inference:
+
+ ```bash
+ pip install optimum[onnxruntime]
+ ```
+
+ To load an ONNX model and run inference with ONNX Runtime, replace `StableDiffusionXLPipeline` with Optimum's `ORTStableDiffusionXLPipeline`. If you want to load a PyTorch model and convert it to the ONNX format on the fly, you can set `export=True`.
+
+ ```diff
+ - from diffusers import StableDiffusionXLPipeline
+ + from optimum.onnxruntime import ORTStableDiffusionXLPipeline
+
+ model_id = "stabilityai/stable-diffusion-xl-base-1.0"
+ - pipeline = StableDiffusionXLPipeline.from_pretrained(model_id)
+ + pipeline = ORTStableDiffusionXLPipeline.from_pretrained(model_id)
+ prompt = "A majestic lion jumping from a big stone at night"
+ image = pipeline(prompt).images[0]
+ ```
+
+ You can find more examples in the optimum [documentation](https://huggingface.co/docs/optimum/main/en/onnxruntime/usage_guides/models#stable-diffusion-xl).
+
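When converting on the fly with `export=True`, it can be worth saving the exported pipeline so the conversion only runs once; a minimal sketch, assuming the standard `save_pretrained` API and a hypothetical local path `./sdxl-onnx`:

```py
from optimum.onnxruntime import ORTStableDiffusionXLPipeline

model_id = "stabilityai/stable-diffusion-xl-base-1.0"
# export=True converts the PyTorch checkpoint to ONNX on the fly
pipeline = ORTStableDiffusionXLPipeline.from_pretrained(model_id, export=True)
# persist the ONNX export so subsequent loads can skip the conversion
pipeline.save_pretrained("./sdxl-onnx")

image = pipeline("A majestic lion jumping from a big stone at night").images[0]
```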
 
 ## Uses
 
@@ -117,4 +212,4 @@ The model was not trained to be factual or true representations of people or eve
 - The autoencoding part of the model is lossy.
 
 ### Bias
- While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases.
+ While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases.
text_encoder/model.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e27bafa0b3029ad637ef3ace24ce1efe85b8d0dbd22e03a2e70bda6fc88963a1
+ size 492587457
text_encoder/openvino_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bbc78395c8cee553a17380e9b1a9a47da926c98731ba31306032d7d45fadb29b
+ size 492242672
text_encoder/openvino_model.xml ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ab5cf7327374d8c984f4e963564a329f92c9dad08dac9eee9b8dca86b912f1c9
+ size 1057789
text_encoder_2/model.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:162042ac6556e73f93d4172d4c67532c1cbe4dc7a6a8fa7e44dd2e3d7cbb772b
+ size 1041992
text_encoder_2/model.onnx_data ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3da7ac65349fbd092e836e3eeca2c22811317bc804fd70af157b4550f2d4bcb5
+ size 2778639360
text_encoder_2/openvino_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:549d05154b0c09d46226f85abd48552f0ef999af4f24a95b3fb62d5e7d059570
+ size 2778640120
text_encoder_2/openvino_model.xml ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:38f0a4ff68dd918b24908a264140c2ad0e057eca82616f75c17cbf4a099ad6ad
+ size 2790191
unet/model.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6f001c090fb13c0d0f8b0a5916da814712a94400b99471fabe77c1c4a51ecaaf
+ size 7293842
unet/model.onnx_data ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7905b71f0044c5ea8fea8ca0451bd73cad53492ad50f964c49c3ff9250afa350
+ size 10269854720
unet/openvino_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2d586bcb83c004ab07f5899adcac3d46189afe058d6a581570f0a613a010d9ec
+ size 10269856428
unet/openvino_model.xml ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:18955f96dffdba5612c4b554451f4ccc947c93e46df010173791654af4d0d7f6
+ size 22577438
vae_decoder/config.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "_class_name": "AutoencoderKL",
+   "_diffusers_version": "0.19.0.dev0",
+   "act_fn": "silu",
+   "block_out_channels": [
+     128,
+     256,
+     512,
+     512
+   ],
+   "down_block_types": [
+     "DownEncoderBlock2D",
+     "DownEncoderBlock2D",
+     "DownEncoderBlock2D",
+     "DownEncoderBlock2D"
+   ],
+   "force_upcast": true,
+   "in_channels": 3,
+   "latent_channels": 4,
+   "layers_per_block": 2,
+   "norm_num_groups": 32,
+   "out_channels": 3,
+   "sample_size": 1024,
+   "scaling_factor": 0.13025,
+   "up_block_types": [
+     "UpDecoderBlock2D",
+     "UpDecoderBlock2D",
+     "UpDecoderBlock2D",
+     "UpDecoderBlock2D"
+   ]
+ }
vae_decoder/model.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0892c5e28b35791140467f7b9c9fa148c24238a5f0c381b1d4c22dcd2ed365cb
+ size 198093688
vae_decoder/openvino_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:34ea744ad1d75fb6b8825e31f5adbe7d62cbe2e7d061535b0a12e69c2f72d0f4
+ size 197961232
vae_decoder/openvino_model.xml ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dd61f43e981282b77ecaecf5fc5c842d504932bae78ac99ec581cee50978b423
+ size 992181
vae_encoder/config.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "_class_name": "AutoencoderKL",
+   "_diffusers_version": "0.19.0.dev0",
+   "act_fn": "silu",
+   "block_out_channels": [
+     128,
+     256,
+     512,
+     512
+   ],
+   "down_block_types": [
+     "DownEncoderBlock2D",
+     "DownEncoderBlock2D",
+     "DownEncoderBlock2D",
+     "DownEncoderBlock2D"
+   ],
+   "force_upcast": true,
+   "in_channels": 3,
+   "latent_channels": 4,
+   "layers_per_block": 2,
+   "norm_num_groups": 32,
+   "out_channels": 3,
+   "sample_size": 1024,
+   "scaling_factor": 0.13025,
+   "up_block_types": [
+     "UpDecoderBlock2D",
+     "UpDecoderBlock2D",
+     "UpDecoderBlock2D",
+     "UpDecoderBlock2D"
+   ]
+ }
vae_encoder/model.onnx ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7b117fbb21531efd59d68c95682392785999bf3e0c2ce95647c6e0de9af36e74
+ size 136775724
vae_encoder/openvino_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:97f04b0cf74808c7bd9b6e09f080e8cd24821943c3c06b153145989889215ce5
+ size 136655184
vae_encoder/openvino_model.xml ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a3ec36b6f3f74d0cb2b005b7c0a1e5426c5ef1e7163b33e463ea57fa049c5996
+ size 849965