Diffusers documentation

Conditional Image Generation

You are viewing v0.10.2 version. A newer version v0.31.0 is available.

The DiffusionPipeline is the easiest way to use a pre-trained diffusion system for inference.

Start by creating an instance of DiffusionPipeline and specifying which pipeline checkpoint you would like to download. You can use the DiffusionPipeline with any Diffusers checkpoint. In this guide, though, you’ll use DiffusionPipeline for text-to-image generation with Latent Diffusion:

>>> from diffusers import DiffusionPipeline

>>> generator = DiffusionPipeline.from_pretrained("CompVis/ldm-text2im-large-256")

The DiffusionPipeline downloads and caches all modeling, tokenization, and scheduling components. Because the model consists of roughly 1.4 billion parameters, we strongly recommend running it on a GPU. You can move the generator object to the GPU, just as you would in PyTorch:

>>> generator.to("cuda")
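
If the code may also run on machines without a GPU, the device can be chosen at run time instead of being hard-coded. The following is a minimal sketch, assuming PyTorch is the backend; `pick_device` is a hypothetical helper introduced here for illustration and is not part of Diffusers:

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu".

    Hypothetical helper for illustration; not part of Diffusers.
    """
    # Without PyTorch installed there is nothing to query, so fall back to CPU.
    if importlib.util.find_spec("torch") is None:
        return "cpu"

    import torch

    # torch.cuda.is_available() is True only when a CUDA device can be used.
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Selected device: {device}")
```

Passing the result to `generator.to(device)` would then take the place of the literal `"cuda"`. On a CPU the pipeline should still work, but given the model size it is likely to be much slower.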

Now you can use the generator on your text prompt:

>>> image = generator("An image of a squirrel in Picasso style").images[0]

By default, the output is wrapped in a PIL Image object.
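
The `.images[0]` indexing in the call above indicates that the pipeline returns a list of images. The sketch below builds on that to generate one image per prompt in a single call. It assumes the pipeline also accepts a list of prompts and returns an object with an `images` attribute; `generate_batch` is a hypothetical helper, so check the behaviour against the pipeline you actually load:

```python
from typing import Any, Callable, List, Sequence


def generate_batch(pipeline: Callable[..., Any], prompts: Sequence[str]) -> List[Any]:
    """Run the pipeline once on several prompts and return one image per prompt.

    Hypothetical helper; assumes `pipeline(list_of_prompts)` returns an
    object whose `images` attribute is a list of images.
    """
    output = pipeline(list(prompts))
    images = list(output.images)

    # Guard against pipelines that return a different number of images.
    if len(images) != len(prompts):
        raise ValueError(
            f"expected {len(prompts)} images, got {len(images)}"
        )
    return images
```

With the `generator` object from this guide, a call might look like `images = generate_batch(generator, ["A squirrel in Picasso style", "A squirrel in Monet style"])`.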

You can save the image by calling its save method:

>>> image.save("image_of_squirrel_painting.png")
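
When generating images for many prompts, it can be convenient to derive the file name from the prompt rather than typing it by hand. A minimal sketch using only the standard library; `prompt_to_filename` and its parameters are hypothetical names chosen for this example:

```python
import re


def prompt_to_filename(prompt: str, extension: str = "png", max_length: int = 60) -> str:
    """Turn a free-form prompt into a filesystem-friendly file name.

    Hypothetical helper for illustration; not part of Diffusers or PIL.
    """
    # Lowercase, then collapse every run of non-alphanumeric characters into "_".
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")
    # Truncate long prompts and fall back to a generic name for empty ones.
    slug = slug[:max_length].rstrip("_") or "image"
    return f"{slug}.{extension}"


filename = prompt_to_filename("An image of a squirrel in Picasso style")
print(filename)  # an_image_of_a_squirrel_in_picasso_style.png
```

The result can be passed straight to the image, for example `image.save(filename)`.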