Diffusers documentation

Habana Gaudi


🤗 Diffusers is compatible with Habana Gaudi through 🤗 Optimum. Follow the installation guide to install the SynapseAI and Gaudi drivers, and then install Optimum Habana:

python -m pip install --upgrade-strategy eager optimum[habana]
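After installing, it can be useful to confirm that the package is importable before building a pipeline. The snippet below is a minimal sketch using only the standard library; the helper name `optimum_habana_installed` is hypothetical, and the check says nothing about whether the SynapseAI and Gaudi drivers are set up correctly.

```python
import importlib.util


def optimum_habana_installed() -> bool:
    """Return True if the `optimum.habana` package can be located."""
    try:
        # find_spec imports the parent package `optimum` first, so it raises
        # ModuleNotFoundError when `optimum` itself is missing.
        return importlib.util.find_spec("optimum.habana") is not None
    except ModuleNotFoundError:
        return False


if optimum_habana_installed():
    print("optimum.habana is available")
else:
    print("optimum.habana was not found; re-run the pip command above")
```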

To generate images with Stable Diffusion 1 and 2 on Gaudi, you need to instantiate two objects:

  • GaudiStableDiffusionPipeline, a pipeline for text-to-image generation.
  • GaudiDDIMScheduler, a Gaudi-optimized scheduler.

When you initialize the pipeline, specify use_habana=True to deploy it on HPUs. To get the fastest possible generation, also enable HPU graphs with use_hpu_graphs=True.

Finally, specify a GaudiConfig, which can be downloaded from the Habana organization on the Hub.

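from optimum.habana import GaudiConfig
from optimum.habana.diffusers import GaudiDDIMScheduler, GaudiStableDiffusionPipeline

model_name = "stabilityai/stable-diffusion-2-base"
scheduler = GaudiDDIMScheduler.from_pretrained(model_name, subfolder="scheduler")
pipeline = GaudiStableDiffusionPipeline.from_pretrained(
    model_name,
    scheduler=scheduler,
    use_habana=True,  # deploy the pipeline on HPUs
    use_hpu_graphs=True,  # enable HPU graphs for faster generation
    gaudi_config="Habana/stable-diffusion-2",  # Gaudi configuration from the Hub
)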

Now you can call the pipeline to generate images in batches from one or more prompts:

outputs = pipeline(
    prompt=[
        "High quality photo of an astronaut riding a horse in space",
        "Face of a yellow cat, high resolution, sitting on a park bench",
    ],
)

For more information, check out 🤗 Optimum Habana’s documentation and the example provided in the official GitHub repository.


We benchmarked Habana’s first-generation Gaudi and Gaudi2 with the Habana/stable-diffusion and Habana/stable-diffusion-2 Gaudi configurations (mixed precision bf16/fp32) to demonstrate their performance.

For Stable Diffusion v1.5 on 512x512 images:

| Device | Latency (batch size = 1) | Throughput |
| --- | --- | --- |
| first-generation Gaudi | 3.80s | 0.308 images/s (batch size = 8) |
| Gaudi2 | 1.33s | 1.081 images/s (batch size = 8) |

For Stable Diffusion v2.1 on 768x768 images:

| Device | Latency (batch size = 1) | Throughput |
| --- | --- | --- |
| first-generation Gaudi | 10.2s | 0.108 images/s (batch size = 4) |
| Gaudi2 | 3.17s | 0.379 images/s (batch size = 8) |
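
As a rough way to read these figures, the sketch below computes the Gaudi2 speedup over first-generation Gaudi from the numbers in the tables above. The dictionary layout and the `speedups` helper are hypothetical and exist only for this illustration. Note that the v2.1 throughput figures were measured at different batch sizes (4 and 8), so that ratio is not a like-for-like comparison.

```python
# Figures copied from the benchmark tables above.
benchmarks = {
    "Stable Diffusion v1.5, 512x512": {
        "first-generation Gaudi": {"latency_s": 3.80, "throughput_img_s": 0.308},
        "Gaudi2": {"latency_s": 1.33, "throughput_img_s": 1.081},
    },
    "Stable Diffusion v2.1, 768x768": {
        "first-generation Gaudi": {"latency_s": 10.2, "throughput_img_s": 0.108},
        "Gaudi2": {"latency_s": 3.17, "throughput_img_s": 0.379},
    },
}


def speedups(results):
    """Return Gaudi2 speedup factors relative to first-generation Gaudi."""
    gen1 = results["first-generation Gaudi"]
    gen2 = results["Gaudi2"]
    return {
        # Lower latency is better, so divide the old value by the new one.
        "latency": gen1["latency_s"] / gen2["latency_s"],
        # Higher throughput is better, so divide the new value by the old one.
        "throughput": gen2["throughput_img_s"] / gen1["throughput_img_s"],
    }


for name, results in benchmarks.items():
    s = speedups(results)
    print(f"{name}: latency {s['latency']:.2f}x, throughput {s['throughput']:.2f}x")
```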