Stable Diffusion
Stable Diffusion is a text-to-image latent diffusion model. Check out this blog post for more information.
How to generate images?
To generate images with Stable Diffusion on Gaudi, you need to instantiate two objects:
- A pipeline with GaudiStableDiffusionPipeline. This pipeline supports text-to-image generation.
- A scheduler with GaudiDDIMScheduler. This scheduler has been optimized for Gaudi.
When initializing the pipeline, you have to specify use_habana=True to deploy it on HPUs.
Furthermore, to get the fastest possible generations, you should enable HPU graphs with use_hpu_graphs=True.
Finally, you will need to specify a Gaudi configuration which can be downloaded from the Hugging Face Hub.
from optimum.habana.diffusers import GaudiDDIMScheduler, GaudiStableDiffusionPipeline

model_name = "runwayml/stable-diffusion-v1-5"

# Load the Gaudi-optimized scheduler from the checkpoint's "scheduler" subfolder
scheduler = GaudiDDIMScheduler.from_pretrained(model_name, subfolder="scheduler")

pipeline = GaudiStableDiffusionPipeline.from_pretrained(
    model_name,
    scheduler=scheduler,
    use_habana=True,  # deploy on HPUs
    use_hpu_graphs=True,  # enable HPU graphs for faster generation
    gaudi_config="Habana/stable-diffusion",  # Gaudi configuration from the Hugging Face Hub
)
You can then call the pipeline to generate images from one or several prompts:
outputs = pipeline(
    prompt=[
        "High quality photo of an astronaut riding a horse in space",
        "Face of a yellow cat, high resolution, sitting on a park bench",
    ],
    num_images_per_prompt=10,
    batch_size=4,
    output_type="pil",
)
Outputs can be PIL images or NumPy arrays. See the pipeline documentation for all the parameters you can set to tailor generations to your taste.
Check out the example provided in the official GitHub repository.
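To get a feel for how num_images_per_prompt and batch_size interact in the call above, the arithmetic can be sketched in plain Python. This is only an illustration under an assumption: that the pipeline generates one image per prompt and per requested copy, and processes them in fixed-size batches. The helper plan_batches is hypothetical and is not part of Optimum Habana.

```python
import math


def plan_batches(num_prompts: int, num_images_per_prompt: int, batch_size: int) -> dict:
    """Hypothetical helper: estimate how a generation request splits into batches."""
    total_images = num_prompts * num_images_per_prompt
    num_batches = math.ceil(total_images / batch_size)
    # Slots in the last batch that hold no requested image
    # (assumption: every batch has exactly batch_size slots)
    padding = num_batches * batch_size - total_images
    return {"total_images": total_images, "num_batches": num_batches, "padding": padding}


# Two prompts, 10 images per prompt, 4 images per batch, as in the call above
print(plan_batches(num_prompts=2, num_images_per_prompt=10, batch_size=4))
```

Under this assumption, choosing a batch_size that divides the total number of images evenly avoids a partially filled last batch.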
Stable Diffusion 2
Stable Diffusion 2 can be used with the exact same classes. Here is an example:
from optimum.habana.diffusers import GaudiDDIMScheduler, GaudiStableDiffusionPipeline

model_name = "stabilityai/stable-diffusion-2-1"

scheduler = GaudiDDIMScheduler.from_pretrained(model_name, subfolder="scheduler")

pipeline = GaudiStableDiffusionPipeline.from_pretrained(
    model_name,
    scheduler=scheduler,
    use_habana=True,
    use_hpu_graphs=True,
    gaudi_config="Habana/stable-diffusion-2",
)

# This checkpoint generates 768x768 images
outputs = pipeline(
    ["An image of a squirrel in Picasso style"],
    num_images_per_prompt=10,
    batch_size=2,
    height=768,
    width=768,
)
There are two different checkpoints for Stable Diffusion 2:
- use stabilityai/stable-diffusion-2-1 for generating 768x768 images
- use stabilityai/stable-diffusion-2-1-base for generating 512x512 images
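If the target resolution is chosen at runtime, the mapping above can be kept in a small lookup table. The following is a minimal sketch: the checkpoint names come from the list above, while SD2_CHECKPOINTS and sd2_checkpoint_for are hypothetical names introduced only for illustration.

```python
# Checkpoint names are taken from the list above; the helper itself is hypothetical.
SD2_CHECKPOINTS = {
    768: "stabilityai/stable-diffusion-2-1",
    512: "stabilityai/stable-diffusion-2-1-base",
}


def sd2_checkpoint_for(size: int) -> str:
    """Return the Stable Diffusion 2 checkpoint listed for square images of the given size."""
    if size not in SD2_CHECKPOINTS:
        supported = ", ".join(str(s) for s in sorted(SD2_CHECKPOINTS))
        raise ValueError(f"No checkpoint listed for {size}x{size}; supported sizes: {supported}")
    return SD2_CHECKPOINTS[size]


model_name = sd2_checkpoint_for(512)
print(model_name)
```

The returned name can then be passed as model_name to from_pretrained, with height and width set to the same size when calling the pipeline.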
Tips
To accelerate your Stable Diffusion pipeline, you can run it in full bfloat16 precision.
This will also save memory.
You just need to pass torch_dtype=torch.bfloat16 to from_pretrained when instantiating your pipeline.
Here is how to do it:
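import torch
from optimum.habana.diffusers import GaudiDDIMScheduler, GaudiStableDiffusionPipeline

model_name = "runwayml/stable-diffusion-v1-5"
scheduler = GaudiDDIMScheduler.from_pretrained(model_name, subfolder="scheduler")

pipeline = GaudiStableDiffusionPipeline.from_pretrained(
    model_name,
    scheduler=scheduler,
    use_habana=True,
    use_hpu_graphs=True,
    gaudi_config="Habana/stable-diffusion",
    torch_dtype=torch.bfloat16,  # run the whole pipeline in bfloat16
)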
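The memory saving follows from element size: a bfloat16 value takes 2 bytes, while a float32 value takes 4. The sketch below estimates the memory needed for the weights alone. The parameter count is an illustrative round number, not a measured figure for any checkpoint, and weight_memory_gib is a hypothetical helper.

```python
# Size of one element in bytes for each precision
BYTES_PER_ELEMENT = {"float32": 4, "bfloat16": 2}


def weight_memory_gib(num_parameters: int, dtype: str) -> float:
    """Hypothetical helper: memory needed to store the weights alone, in GiB."""
    return num_parameters * BYTES_PER_ELEMENT[dtype] / 1024**3


# Illustrative parameter count (assumption, not tied to a specific model)
num_parameters = 1_000_000_000

for dtype in BYTES_PER_ELEMENT:
    print(f"{dtype}: {weight_memory_gib(num_parameters, dtype):.2f} GiB")
```

This covers only the model weights; activations and other runtime buffers also take memory, so the actual saving on a device may differ.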