runwayml/stable-diffusion-inpainting

Text-to-Image Diffusers StableDiffusionInpaintPipeline stable-diffusion stable-diffusion-diffusers 5 papers

patrickvonplaten committed on Oct 19, 2022

Commit 3c5bc0c • 1 Parent(s): 73eda7e

Update README.md

Files changed (1)

README.md +43 -3

@@ -24,10 +24,50 @@ Stable Diffusion Inpainting is a latent text-to-image diffusion model capable of
 The **Stable-Diffusion-Inpainting** model was initialized with the weights of [Stable-Diffusion-v-1-2](https://huggingface.co/CompVis/stable-diffusion-v-1-2-original). Training ran for 595k steps of regular training, then 440k steps of inpainting training at resolution 512x512 on “laion-aesthetics v2 5+”, with 10% dropping of the text-conditioning to improve [classifier-free guidance sampling](https://arxiv.org/abs/2207.12598). For inpainting, the UNet has 5 additional input channels (4 for the encoded masked image and 1 for the mask itself) whose weights were zero-initialized after restoring the non-inpainting checkpoint. During training, we generate synthetic masks and, in 25% of cases, mask everything.
-#### Download the weights
-- [sd-v1-5-inpainting.ckpt](https://huggingface.co/runwayml/stable-diffusion-inpainting/resolve/main/sd-v1-5-inpainting.ckpt)
-These weights are intended to be used with the original [RunwayML Stable Diffusion codebase](https://github.com/runwayml/stable-diffusion). If you are looking for the model to use with the 🧨Diffusers library, it is coming soon.
+## Examples:
+You can use this repository both with the 🧨Diffusers library and the original [GitHub repository](https://github.com/runwayml/stable-diffusion).
+### Diffusers
+```python
+from io import BytesIO
+import requests
+import torch
+from PIL import Image
+from diffusers import StableDiffusionInpaintPipeline
+def download_image(url):
+    """Fetch an image over HTTP and return it as an RGB PIL image."""
+    response = requests.get(url)
+    response.raise_for_status()
+    return Image.open(BytesIO(response.content)).convert("RGB")
+# Source image and its mask, both resized to the 512x512 training resolution.
+image = download_image("https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png")
+image = image.resize((512, 512))
+mask_image = download_image("https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png")
+mask_image = mask_image.resize((512, 512))
+# Load the half-precision weights from the "fp16" revision.
+pipe = StableDiffusionInpaintPipeline.from_pretrained(
+    "runwayml/stable-diffusion-inpainting",
+    revision="fp16",
+    torch_dtype=torch.float16,
+)
+pipe = pipe.to("cuda")
+# Attention slicing trades some speed for lower GPU memory use.
+pipe.enable_attention_slicing()
+prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
+# White mask pixels are repainted according to the prompt; black pixels are kept.
+image = pipe(prompt=prompt, image=image, mask_image=mask_image).images[0]
+image.save("./yellow_cat_on_park_bench.png")
+```
+**How it works:**
+`image`          | `mask_image` | `prompt` |  | **Output** |
+:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:|-------------------------:|
+<img src="https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png" alt="drawing" width="100"/> | <img src="https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png" alt="drawing" width="100"/> | ***Face of a yellow cat, high resolution, sitting on a park bench*** | **=>** | <img src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/test.png" alt="drawing" width="100"/> |
+### Original GitHub Repository
+1. Download the weights [sd-v1-5-inpainting.ckpt](https://huggingface.co/runwayml/stable-diffusion-inpainting/resolve/main/sd-v1-5-inpainting.ckpt)
+2. Follow instructions [here](https://github.com/runwayml/stable-diffusion#inpainting-with-stable-diffusion).
 ## Model Details
 - **Developed by:** Robin Rombach, Patrick Esser
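
The README paragraph in this diff says the inpainting UNet gains 5 extra input channels whose weights were zero-initialized after restoring the non-inpainting checkpoint. The following is a minimal NumPy sketch of why that choice is convenient, not the authors' training code: with zero weights on the new channels, the first convolution initially produces the same output as the restored checkpoint, whatever the masked-image latents and mask contain. The helper name `expand_input_channels`, the 320 output channels, the 3x3 kernel, and the channel ordering are all assumptions made for illustration.

```python
import numpy as np


def expand_input_channels(weight, extra_channels):
    """Copy a conv weight of shape (out, in, kh, kw) and append zero-filled input channels.

    Hypothetical helper for illustration; not part of any library.
    """
    out_ch, in_ch, kh, kw = weight.shape
    expanded = np.zeros((out_ch, in_ch + extra_channels, kh, kw), dtype=weight.dtype)
    expanded[:, :in_ch] = weight  # keep the restored checkpoint weights unchanged
    return expanded


rng = np.random.default_rng(0)

# Stand-in for the first conv of the non-inpainting UNet: 4 latent input channels.
# The 320 output channels and 3x3 kernel are assumed values.
base_weight = rng.normal(size=(320, 4, 3, 3))
inpaint_weight = expand_input_channels(base_weight, extra_channels=5)

# One 3x3 input patch per source: noisy latents (4), encoded masked image (4), mask (1).
latent_patch = rng.normal(size=(4, 3, 3))
masked_image_patch = rng.normal(size=(4, 3, 3))
mask_patch = rng.integers(0, 2, size=(1, 3, 3)).astype(float)

# Channel order here is an assumption; only the total of 9 channels matters for the sketch.
inpaint_patch = np.concatenate([latent_patch, masked_image_patch, mask_patch], axis=0)

# Convolution output at a single spatial location is a contraction over (in, kh, kw).
base_out = np.einsum("oikl,ikl->o", base_weight, latent_patch)
inpaint_out = np.einsum("oikl,ikl->o", inpaint_weight, inpaint_patch)

print(inpaint_weight.shape)
print(np.allclose(base_out, inpaint_out))
```

Because the extra channels contribute nothing at initialization, inpainting training starts from the behavior of the restored checkpoint and only gradually learns to use the mask and the masked-image latents.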