Diffusers

You are viewing version v0.11.0. A newer version, v0.35.1, is available.

Text-Guided Image-to-Image Generation

The StableDiffusionDepth2ImgPipeline conditions the generation of new images on a text prompt and an initial image. You can also pass a depth_map to preserve the structure of the initial image. If no depth_map is provided, the pipeline automatically predicts the depth with an integrated depth-estimation model.

import requests
import torch
from PIL import Image

from diffusers import StableDiffusionDepth2ImgPipeline

# Load the depth-conditioned Stable Diffusion 2 checkpoint in half precision on the GPU.
pipe = StableDiffusionDepth2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-depth",
    torch_dtype=torch.float16,
).to("cuda")

# Download the initial image that conditions the generation.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
init_image = Image.open(requests.get(url, stream=True).raw)

prompt = "two tigers"
n_prompt = "bad, deformed, ugly, bad anatomy"  # qualities to steer away from

# No depth_map is passed, so the pipeline predicts depth from init_image.
image = pipe(prompt=prompt, image=init_image, negative_prompt=n_prompt, strength=0.7).images[0]
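
The strength argument in the call above controls how far the result may drift from the initial image: higher values add more noise to the initial image, and a value of 1.0 runs the full denoising schedule. As a rough sketch of that relationship, the helper below estimates how many denoising steps run for a given strength. The function name approx_denoising_steps is hypothetical and not part of diffusers, and the formula is an approximation; the exact count can vary with the library version and scheduler.

```python
def approx_denoising_steps(num_inference_steps: int, strength: float) -> int:
    """Estimate how many denoising steps an image-to-image call runs.

    Hypothetical helper for illustration only; not part of diffusers.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError(f"strength must be in [0, 1], got {strength}")
    # Higher strength adds more noise to the initial image, so a larger
    # share of the schedule has to be denoised; 1.0 means the full schedule.
    return min(int(num_inference_steps * strength), num_inference_steps)


if __name__ == "__main__":
    # Assumes a schedule of 50 steps purely for illustration.
    for s in (0.25, 0.5, 1.0):
        print(f"strength={s}: ~{approx_denoising_steps(50, s)} steps")
```

In practice, lower strength values keep the output closer to the initial image and finish faster, while values near 1.0 give the prompt more influence at the cost of fidelity to the input.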
