Diffusers

You are viewing main version, which requires installation from source. If you'd like regular pip install, checkout the latest stable version (v0.34.0).

Join the Hugging Face community

and get access to the augmented documentation experience

Collaborate on models, datasets and Spaces

Faster examples with accelerated inference

Switch between documentation themes

to get started

Reproducible pipelines

Diffusion models are inherently random which is what allows it to generate different outputs every time it is run. But there are certain times when you want to generate the same output every time, like when you’re testing, replicating results, and even improving image quality. While you can’t expect to get identical results across platforms, you can expect reproducible results across releases and platforms within a certain tolerance range (though even this may vary).

This guide will show you how to control randomness for deterministic generation on a CPU and GPU.

We strongly recommend reading PyTorch’s statement about reproducibility:

“Completely reproducible results are not guaranteed across PyTorch releases, individual commits, or different platforms. Furthermore, results may not be reproducible between CPU and GPU executions, even when using identical seeds.”

Control randomness

During inference, pipelines rely heavily on random sampling operations which include creating the Gaussian noise tensors to denoise and adding noise to the scheduling step.

Take a look at the tensor values in the DDIMPipeline after two inference steps.

from diffusers import DDIMPipeline
import numpy as np

ddim = DDIMPipeline.from_pretrained( "google/ddpm-cifar10-32", use_safetensors=True)
image = ddim(num_inference_steps=2, output_type="np").images
print(np.abs(image).sum())

Running the code above prints one value, but if you run it again you get a different value.

Each time the pipeline is run, torch.randn uses a different random seed to create the Gaussian noise tensors. This leads to a different result each time it is run and enables the diffusion pipeline to generate a different random image each time.

But if you need to reliably generate the same image, that depends on whether you’re running the pipeline on a CPU or GPU.

It might seem unintuitive to pass Generator objects to a pipeline instead of the integer value representing the seed. However, this is the recommended design when working with probabilistic models in PyTorch because a Generator is a random state that can be passed to multiple pipelines in a sequence. As soon as the Generator is consumed, the state is changed in place which means even if you passed the same Generator to a different pipeline, it won’t produce the same result because the state is already changed.

CPU

GPU

Deterministic algorithms

You can also configure PyTorch to use deterministic algorithms to create a reproducible pipeline. The downside is that deterministic algorithms may be slower than non-deterministic ones and you may observe a decrease in performance.

Non-deterministic behavior occurs when operations are launched in more than one CUDA stream. To avoid this, set the environment variable CUBLAS_WORKSPACE_CONFIG to :16:8 to only use one buffer size during runtime.

PyTorch typically benchmarks multiple algorithms to select the fastest one, but if you want reproducibility, you should disable this feature because the benchmark may select different algorithms each time. Set Diffusers enable_full_determinism to enable deterministic algorithms.

enable_full_determinism()

Now when you run the same pipeline twice, you’ll get identical results.

import torch
from diffusers import DDIMScheduler, StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", use_safetensors=True).to("cuda")
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
g = torch.Generator(device="cuda")

prompt = "A bear is playing a guitar on Times Square"

g.manual_seed(0)
result1 = pipe(prompt=prompt, num_inference_steps=50, generator=g, output_type="latent").images

g.manual_seed(0)
result2 = pipe(prompt=prompt, num_inference_steps=50, generator=g, output_type="latent").images

print("L_inf dist =", abs(result1 - result2).max())
"L_inf dist = tensor(0., device='cuda:0')"

< > Update on GitHub

←Pipeline callbacks Controlling image quality→