A common way to improve the quality of generated images is deterministic batch generation: generate a batch of images, then select one image to improve with a more detailed prompt in a second round of inference. The key is to pass a list of torch.Generator objects to the pipeline for batched image generation, and to tie each Generator to a seed so you can reuse it for a specific image.
Use a checkpoint such as runwayml/stable-diffusion-v1-5, for example, and generate several versions of the following prompt:
prompt = "Labrador in the style of Vermeer"
Instantiate a pipeline with DiffusionPipeline.from_pretrained() and place it on a GPU (if available):
import torch
from diffusers import DiffusionPipeline

# torch must be imported because torch_dtype refers to torch.float16
pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipe = pipe.to("cuda")
Now, define four different Generator objects and assign each Generator a seed (0 to 3) so you can reuse a Generator later for a specific image:
import torch

# One Generator per image, seeded 0, 1, 2, 3
generator = [torch.Generator(device="cuda").manual_seed(i) for i in range(4)]
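To see why tying each Generator to a seed makes an image reproducible, here is a minimal sketch that draws random noise from seeded generators twice and compares the results. It is only an illustration: it uses CPU generators so it runs without a GPU, the `draw_noise` helper is hypothetical, and the tensor shape is an assumed stand-in rather than something read from the pipeline.

```python
import torch

# Assumption: CPU generators so the sketch runs anywhere (the text uses "cuda").
# Assumption: (1, 4, 64, 64) is an illustrative noise shape, not taken from the pipeline.
SHAPE = (1, 4, 64, 64)


def draw_noise(seeds):
    """Hypothetical helper: draw one noise tensor per seed, each from its own Generator."""
    generators = [torch.Generator(device="cpu").manual_seed(s) for s in seeds]
    return [torch.randn(SHAPE, generator=g) for g in generators]


first_run = draw_noise(range(4))
second_run = draw_noise(range(4))

# Same seed, same noise: the starting point for each image can be recreated exactly.
same_seed_matches = all(torch.equal(a, b) for a, b in zip(first_run, second_run))

# Different seeds give different noise, so the images in a batch differ from each other.
different_seeds_differ = not torch.equal(first_run[0], first_run[1])

print(same_seed_matches, different_seeds_differ)
```

Because each image has its own Generator, you can later recreate the random starting point of one image without affecting the others.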
Generate the images and have a look:
images = pipe(prompt, generator=generator, num_images_per_prompt=4).images
images
In this example, you'll improve upon the first image, but in reality you can use any image you want (even the image with double sets of eyes!). The first image used the Generator with seed 0, so you'll reuse that seed for the second round of inference. To improve the quality of the image, add some additional text to the prompt:
# Four variations of the original prompt
prompt = [prompt + t for t in [", highly realistic", ", artsy", ", trending", ", colorful"]]
# Four fresh Generators, all with seed 0
generator = [torch.Generator(device="cuda").manual_seed(0) for _ in range(4)]
With four generators that all use seed 0, generate another batch of images, all of which should look like the first image from the previous round!
images = pipe(prompt, generator=generator).images
images
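The seed bookkeeping across the two rounds can be sketched without running the pipeline. This is a minimal illustration, not part of the diffusers API: `seeded_generators` and `chosen_index` are hypothetical names, and CPU generators are assumed so the snippet runs without a GPU. It builds fresh Generator objects for the second round, since drawing random numbers from a Generator advances its internal state.

```python
import torch


def seeded_generators(seeds, device="cpu"):
    """Hypothetical helper: build one fresh Generator per seed."""
    return [torch.Generator(device=device).manual_seed(s) for s in seeds]


# Round one: one Generator per image, seeded 0 to 3.
first_round = seeded_generators(range(4))

# Suppose you liked the first image (index 0); read back the seed it was created with.
chosen_index = 0
chosen_seed = first_round[chosen_index].initial_seed()

# Round two: four fresh Generators that all start from the chosen seed,
# one for each prompt variation.
second_round = seeded_generators([chosen_seed] * 4)
second_round_seeds = [g.initial_seed() for g in second_round]

print(chosen_seed, second_round_seeds)
```

Reading the seed back with `initial_seed()` means you don't need to track separately which seed belongs to which image in the batch.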