openvino-model

#19

by echarlaix HF staff - opened Jul 27, 2023

base: refs/heads/main ← from: refs/pr/19

Files changed (20): +211 -3

README.md +98 -3
text_encoder/model.onnx +3 -0
text_encoder/openvino_model.bin +3 -0
text_encoder/openvino_model.xml +3 -0
text_encoder_2/model.onnx +3 -0
text_encoder_2/model.onnx_data +3 -0
text_encoder_2/openvino_model.bin +3 -0
text_encoder_2/openvino_model.xml +3 -0
unet/model.onnx +3 -0
unet/model.onnx_data +3 -0
unet/openvino_model.bin +3 -0
unet/openvino_model.xml +3 -0
vae_decoder/config.json +31 -0
vae_decoder/model.onnx +3 -0
vae_decoder/openvino_model.bin +3 -0
vae_decoder/openvino_model.xml +3 -0
vae_encoder/config.json +31 -0
vae_encoder/model.onnx +3 -0
vae_encoder/openvino_model.bin +3 -0
vae_encoder/openvino_model.xml +3 -0

README.md CHANGED

@@ -48,7 +48,7 @@ The SDXL base model performs significantly better than the previous variants, an
 ### 🧨 Diffusers
-Make sure to upgrade diffusers to >= 0.18.0:
 ```
 pip install diffusers --upgrade
 ```
@@ -58,7 +58,8 @@ In addition make sure to install `transformers`, `safetensors`, `accelerate` as
 pip install invisible_watermark transformers accelerate safetensors
 ```
-You can use the model then as follows
 ```py
 from diffusers import DiffusionPipeline
 import torch
@@ -74,6 +75,48 @@ prompt = "An astronaut riding a green horse"
 images = pipe(prompt=prompt).images[0]
 ```
 When using `torch >= 2.0`, you can improve the inference speed by 20-30% with `torch.compile`. Simply wrap the UNet with `torch.compile` before running the pipeline:
 ```py
 pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
@@ -87,6 +130,58 @@ instead of `.to("cuda")`:
 + pipe.enable_model_cpu_offload()
 ```
 ## Uses
@@ -117,4 +212,4 @@ The model was not trained to be factual or true representations of people or eve
 - The autoencoding part of the model is lossy.
 ### Bias
-While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases.

 ### 🧨 Diffusers
+Make sure to upgrade diffusers to >= 0.19.0:
 ```
 pip install diffusers --upgrade
 ```
 pip install invisible_watermark transformers accelerate safetensors
 ```
+To just use the base model, you can run:
 ```py
 from diffusers import DiffusionPipeline
 import torch
 images = pipe(prompt=prompt).images[0]
 ```
+To use the whole base + refiner pipeline as an ensemble of experts, you can run:
+```py
+from diffusers import DiffusionPipeline
+import torch
+# load both base & refiner
+base = DiffusionPipeline.from_pretrained(
+    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16", use_safetensors=True
+)
+base.to("cuda")
+refiner = DiffusionPipeline.from_pretrained(
+    "stabilityai/stable-diffusion-xl-refiner-1.0",
+    text_encoder_2=base.text_encoder_2,
+    vae=base.vae,
+    torch_dtype=torch.float16,
+    use_safetensors=True,
+    variant="fp16",
+)
+refiner.to("cuda")
+# Define how many steps to run and what fraction of them each expert handles (80/20 here)
+n_steps = 40
+high_noise_frac = 0.8
+prompt = "A majestic lion jumping from a big stone at night"
+# run both experts
+image = base(
+    prompt=prompt,
+    num_inference_steps=n_steps,
+    denoising_end=high_noise_frac,
+    output_type="latent",
+).images
+image = refiner(
+    prompt=prompt,
+    num_inference_steps=n_steps,
+    denoising_start=high_noise_frac,
+    image=image,
+).images[0]
+```
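As a rough illustration of the 80/20 split above, the sketch below works out how many of the `n_steps` denoising steps fall to each expert. The helper `split_steps` is hypothetical and not part of `diffusers`. It assumes the cut-off is simply `n_steps * high_noise_frac` rounded to the nearest integer, which may differ slightly from how the pipeline maps the fraction onto scheduler timesteps.

```python
def split_steps(n_steps: int, high_noise_frac: float) -> tuple[int, int]:
    """Return (base_steps, refiner_steps) for a given split fraction.

    Hypothetical helper: assumes the base expert handles the first
    round(n_steps * high_noise_frac) steps and the refiner the rest.
    """
    if not 0.0 <= high_noise_frac <= 1.0:
        raise ValueError("high_noise_frac must be between 0 and 1")
    base_steps = round(n_steps * high_noise_frac)
    return base_steps, n_steps - base_steps


# Same values as in the README example: 40 steps, 80% on the base model.
base_steps, refiner_steps = split_steps(40, 0.8)
print(base_steps, refiner_steps)  # 32 8
```

Under this assumption the base model runs the high-noise portion of the schedule and hands its latents to the refiner for the remaining low-noise steps.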
 When using `torch >= 2.0`, you can improve the inference speed by 20-30% with `torch.compile`. Simply wrap the UNet with `torch.compile` before running the pipeline:
 ```py
 pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
 + pipe.enable_model_cpu_offload()
 ```
+For more information on how to use Stable Diffusion XL with `diffusers`, please have a look at [the Stable Diffusion XL Docs](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/stable_diffusion_xl).
+### Optimum
+[Optimum](https://github.com/huggingface/optimum) provides a Stable Diffusion pipeline compatible with both [OpenVINO](https://docs.openvino.ai/latest/index.html) and [ONNX Runtime](https://onnxruntime.ai/).
+#### OpenVINO
+To install Optimum with the dependencies required for OpenVINO:
+```bash
+pip install optimum[openvino]
+```
+To load an OpenVINO model and run inference with OpenVINO Runtime, you need to replace `StableDiffusionXLPipeline` with Optimum `OVStableDiffusionXLPipeline`. In case you want to load a PyTorch model and convert it to the OpenVINO format on-the-fly, you can set `export=True`.
+```diff
+- from diffusers import StableDiffusionXLPipeline
++ from optimum.intel import OVStableDiffusionXLPipeline
+model_id = "stabilityai/stable-diffusion-xl-base-1.0"
+- pipeline = StableDiffusionXLPipeline.from_pretrained(model_id)
++ pipeline = OVStableDiffusionXLPipeline.from_pretrained(model_id)
+prompt = "A majestic lion jumping from a big stone at night"
+image = pipeline(prompt).images[0]
+```
+You can find more examples (such as static reshaping and model compilation) in the Optimum [documentation](https://huggingface.co/docs/optimum/main/en/intel/inference#stable-diffusion-xl).
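The `export=True` path mentioned above could be wrapped as follows. This is a minimal sketch, assuming `optimum[openvino]` is installed. The function name `export_to_openvino` and the output directory are hypothetical, and the import is deferred so that defining the function does not itself require Optimum.

```python
from pathlib import Path


def export_to_openvino(model_id: str, save_dir: str) -> Path:
    """Convert a PyTorch SDXL checkpoint to OpenVINO IR and save it locally.

    Hypothetical helper; calling it downloads the full checkpoint.
    """
    # Deferred import: only needed when the export actually runs.
    from optimum.intel import OVStableDiffusionXLPipeline

    # export=True converts the PyTorch weights to the OpenVINO format on the fly.
    pipeline = OVStableDiffusionXLPipeline.from_pretrained(model_id, export=True)
    # Persist the converted model so later loads can skip the conversion.
    pipeline.save_pretrained(save_dir)
    return Path(save_dir)


# Example call (commented out because it downloads several GB of weights):
# export_to_openvino("stabilityai/stable-diffusion-xl-base-1.0", "sdxl-openvino")
```

Since this PR adds pre-converted `openvino_model.xml`/`.bin` files to the repository, the export step should only be needed for checkpoints that do not ship them.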
+#### ONNX
+To install Optimum with the dependencies required for ONNX Runtime inference:
+```bash
+pip install optimum[onnxruntime]
+```
+To load an ONNX model and run inference with ONNX Runtime, you need to replace `StableDiffusionXLPipeline` with Optimum `ORTStableDiffusionXLPipeline`. In case you want to load a PyTorch model and convert it to the ONNX format on-the-fly, you can set `export=True`.
+```diff
+- from diffusers import StableDiffusionXLPipeline
++ from optimum.onnxruntime import ORTStableDiffusionXLPipeline
+model_id = "stabilityai/stable-diffusion-xl-base-1.0"
+- pipeline = StableDiffusionXLPipeline.from_pretrained(model_id)
++ pipeline = ORTStableDiffusionXLPipeline.from_pretrained(model_id)
+prompt = "A majestic lion jumping from a big stone at night"
+image = pipeline(prompt).images[0]
+```
+You can find more examples in the Optimum [documentation](https://huggingface.co/docs/optimum/main/en/onnxruntime/usage_guides/models#stable-diffusion-xl).
 ## Uses
 - The autoencoding part of the model is lossy.
 ### Bias
+While the capabilities of image generation models are impressive, they can also reinforce or exacerbate social biases.

text_encoder/model.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e27bafa0b3029ad637ef3ace24ce1efe85b8d0dbd22e03a2e70bda6fc88963a1
+size 492587457
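
Each binary file in this PR is stored as a Git LFS pointer: three `key value` lines giving the spec version, a SHA-256 object ID, and the size in bytes. A minimal sketch of reading one such pointer, with the diff `+` markers stripped (the `parse_lfs_pointer` helper is hypothetical):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents for text_encoder/model.onnx, copied from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:e27bafa0b3029ad637ef3ace24ce1efe85b8d0dbd22e03a2e70bda6fc88963a1
size 492587457
"""
info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])  # sha256 492587457
```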

text_encoder/openvino_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:bbc78395c8cee553a17380e9b1a9a47da926c98731ba31306032d7d45fadb29b
+size 492242672

text_encoder/openvino_model.xml ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ab5cf7327374d8c984f4e963564a329f92c9dad08dac9eee9b8dca86b912f1c9
+size 1057789

text_encoder_2/model.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:162042ac6556e73f93d4172d4c67532c1cbe4dc7a6a8fa7e44dd2e3d7cbb772b
+size 1041992

text_encoder_2/model.onnx_data ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3da7ac65349fbd092e836e3eeca2c22811317bc804fd70af157b4550f2d4bcb5
+size 2778639360

text_encoder_2/openvino_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:549d05154b0c09d46226f85abd48552f0ef999af4f24a95b3fb62d5e7d059570
+size 2778640120

text_encoder_2/openvino_model.xml ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:38f0a4ff68dd918b24908a264140c2ad0e057eca82616f75c17cbf4a099ad6ad
+size 2790191

unet/model.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6f001c090fb13c0d0f8b0a5916da814712a94400b99471fabe77c1c4a51ecaaf
+size 7293842

unet/model.onnx_data ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7905b71f0044c5ea8fea8ca0451bd73cad53492ad50f964c49c3ff9250afa350
+size 10269854720

unet/openvino_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2d586bcb83c004ab07f5899adcac3d46189afe058d6a581570f0a613a010d9ec
+size 10269856428

unet/openvino_model.xml ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:18955f96dffdba5612c4b554451f4ccc947c93e46df010173791654af4d0d7f6
+size 22577438

vae_decoder/config.json ADDED

	@@ -0,0 +1,31 @@

+{
+  "_class_name": "AutoencoderKL",
+  "_diffusers_version": "0.19.0.dev0",
+  "act_fn": "silu",
+  "block_out_channels": [
+    128,
+    256,
+    512,
+    512
+  ],
+  "down_block_types": [
+    "DownEncoderBlock2D",
+    "DownEncoderBlock2D",
+    "DownEncoderBlock2D",
+    "DownEncoderBlock2D"
+  ],
+  "force_upcast": true,
+  "in_channels": 3,
+  "latent_channels": 4,
+  "layers_per_block": 2,
+  "norm_num_groups": 32,
+  "out_channels": 3,
+  "sample_size": 1024,
+  "scaling_factor": 0.13025,
+  "up_block_types": [
+    "UpDecoderBlock2D",
+    "UpDecoderBlock2D",
+    "UpDecoderBlock2D",
+    "UpDecoderBlock2D"
+  ]
+}
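
For orientation, the config values above determine the latent shape the decoder works with. The sketch below assumes the usual `AutoencoderKL` layout in which every block after the first halves the spatial resolution, so the downsampling factor is `2 ** (len(block_out_channels) - 1)`. The `latent_shape` helper is hypothetical.

```python
# Relevant fields copied from vae_decoder/config.json above.
config = {
    "block_out_channels": [128, 256, 512, 512],
    "latent_channels": 4,
    "sample_size": 1024,
}


def latent_shape(cfg: dict) -> tuple:
    """Return (channels, height, width) of the latent for a square image."""
    # Assumption: each block after the first downsamples by a factor of 2.
    factor = 2 ** (len(cfg["block_out_channels"]) - 1)
    side = cfg["sample_size"] // factor
    return (cfg["latent_channels"], side, side)


print(latent_shape(config))  # (4, 128, 128)
```

Under that assumption, a 1024x1024 image corresponds to a 4x128x128 latent, an 8x reduction along each spatial axis.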

vae_decoder/model.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0892c5e28b35791140467f7b9c9fa148c24238a5f0c381b1d4c22dcd2ed365cb
+size 198093688

vae_decoder/openvino_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:34ea744ad1d75fb6b8825e31f5adbe7d62cbe2e7d061535b0a12e69c2f72d0f4
+size 197961232

vae_decoder/openvino_model.xml ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:dd61f43e981282b77ecaecf5fc5c842d504932bae78ac99ec581cee50978b423
+size 992181

vae_encoder/config.json ADDED

	@@ -0,0 +1,31 @@

+{
+  "_class_name": "AutoencoderKL",
+  "_diffusers_version": "0.19.0.dev0",
+  "act_fn": "silu",
+  "block_out_channels": [
+    128,
+    256,
+    512,
+    512
+  ],
+  "down_block_types": [
+    "DownEncoderBlock2D",
+    "DownEncoderBlock2D",
+    "DownEncoderBlock2D",
+    "DownEncoderBlock2D"
+  ],
+  "force_upcast": true,
+  "in_channels": 3,
+  "latent_channels": 4,
+  "layers_per_block": 2,
+  "norm_num_groups": 32,
+  "out_channels": 3,
+  "sample_size": 1024,
+  "scaling_factor": 0.13025,
+  "up_block_types": [
+    "UpDecoderBlock2D",
+    "UpDecoderBlock2D",
+    "UpDecoderBlock2D",
+    "UpDecoderBlock2D"
+  ]
+}

vae_encoder/model.onnx ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7b117fbb21531efd59d68c95682392785999bf3e0c2ce95647c6e0de9af36e74
+size 136775724

vae_encoder/openvino_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:97f04b0cf74808c7bd9b6e09f080e8cd24821943c3c06b153145989889215ce5
+size 136655184

vae_encoder/openvino_model.xml ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a3ec36b6f3f74d0cb2b005b7c0a1e5426c5ef1e7163b33e463ea57fa049c5996
+size 849965