Improve image quality with deterministic generation
A common way to improve the quality of generated images is with deterministic batch generation: generate a batch of images and select one image to improve with a more detailed prompt in a second round of inference. The key is to pass a list of torch.Generator instances to the pipeline for batched image generation, and to tie each Generator to a seed so you can reuse it for a specific image.
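To see why tying a Generator to a seed makes an image reusable, here is a minimal sketch (on CPU and independent of any pipeline): a Generator re-created with the same seed produces the same random values, so it also produces the same starting latents.
>>> import torch
>>> # Draw some random values from a Generator seeded with 0
>>> generator = torch.Generator(device="cpu").manual_seed(0)
>>> first = torch.randn((2, 2), generator=generator)
>>> # Re-creating the Generator with the same seed reproduces those values exactly
>>> generator = torch.Generator(device="cpu").manual_seed(0)
>>> second = torch.randn((2, 2), generator=generator)
>>> torch.equal(first, second)
True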
Let's use runwayml/stable-diffusion-v1-5 as an example, and generate several versions of the following prompt:
prompt = "Labrador in the style of Vermeer"
Instantiate a pipeline with DiffusionPipeline.from_pretrained() and place it on a GPU (if available):
>>> import torch
>>> from diffusers import DiffusionPipeline
>>> pipe = DiffusionPipeline.from_pretrained(
...     "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True
... )
>>> pipe = pipe.to("cuda")
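If you don't have a GPU, the rest of this guide still works on CPU, just more slowly; one possible fallback is to keep the default float32 weights, since float16 is poorly supported on CPU:
>>> # Possible CPU fallback (assumes no CUDA device is available):
>>> # keep the default float32 weights instead of torch.float16
>>> pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", use_safetensors=True)
If you go this route, also create the Generators below with device="cpu", since a CUDA Generator cannot be created without a GPU.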
Now, define four different Generators and assign each Generator a seed (0 to 3) so you can reuse a Generator later for a specific image:
>>> import torch
>>> generator = [torch.Generator(device="cuda").manual_seed(i) for i in range(4)]
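One way to double-check which seed each Generator carries (so you know which one to reuse later) is to read it back with initial_seed():
>>> [g.initial_seed() for g in generator]
[0, 1, 2, 3]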
Generate the images and have a look:
>>> images = pipe(prompt, generator=generator, num_images_per_prompt=4).images
>>> images
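The images are returned as PIL images, so if you're not in a notebook where they render inline, one simple way to have a look is to save each one to disk (the filenames here are just illustrative):
>>> for i, image in enumerate(images):
...     image.save(f"labrador_seed_{i}.png")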
In this example, you'll improve upon the first image, but in reality, you can use any image you want (even the image with double sets of eyes!). The first image used the Generator with seed 0, so you'll reuse that Generator for the second round of inference. To improve the quality of the image, add some additional text to the prompt:
prompt = [prompt + t for t in [", highly realistic", ", artsy", ", trending", ", colorful"]]
generator = [torch.Generator(device="cuda").manual_seed(0) for i in range(4)]
Create four generators with seed 0 (a Generator's state advances as it is used, which is why you create fresh ones seeded with 0 rather than reusing the objects from the first round), and generate another batch of images, all of which should look like the first image from the previous round!
>>> images = pipe(prompt, generator=generator).images
>>> images
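And if you had picked a different image from the first round instead, say the one generated with seed 2, you would simply re-create the four generators with that seed before running the refined prompts:
>>> # Hypothetical variation: refine the image that was generated with seed 2 instead
>>> generator = [torch.Generator(device="cuda").manual_seed(2) for _ in range(4)]
>>> images = pipe(prompt, generator=generator).images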