patrickvonplaten committed
Commit 3c5bc0c • 1 Parent(s): 73eda7e

Update README.md

README.md CHANGED
@@ -24,10 +24,50 @@ Stable Diffusion Inpainting is a latent text-to-image diffusion model capable of

**Stable-Diffusion-Inpainting** was initialized with the weights of [Stable-Diffusion-v-1-2](https://huggingface.co/CompVis/stable-diffusion-v-1-2-original). Training consisted of 595k steps of regular training, followed by 440k steps of inpainting training at resolution 512x512 on “laion-aesthetics v2 5+”, with 10% dropping of the text-conditioning to improve [classifier-free guidance sampling](https://arxiv.org/abs/2207.12598). For inpainting, the UNet has 5 additional input channels (4 for the encoded masked image and 1 for the mask itself) whose weights were zero-initialized after restoring the non-inpainting checkpoint. During training, we generate synthetic masks and, in 25% of cases, mask everything.
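
To make the zero-initialization step more concrete, here is a minimal PyTorch sketch of how a first convolution could be widened from 4 to 9 input channels. It is not the training code used for this checkpoint: the helper `expand_conv_in` is hypothetical, and the layer shape (320 output channels, 3x3 kernel) is an assumption chosen for illustration.

```python
import torch
from torch import nn


def expand_conv_in(old_conv: nn.Conv2d, extra_channels: int = 5) -> nn.Conv2d:
    """Hypothetical helper: copy `old_conv` into a layer with more input channels."""
    new_conv = nn.Conv2d(
        old_conv.in_channels + extra_channels,
        old_conv.out_channels,
        kernel_size=old_conv.kernel_size,
        stride=old_conv.stride,
        padding=old_conv.padding,
        bias=old_conv.bias is not None,
    )
    with torch.no_grad():
        new_conv.weight.zero_()  # the added input channels start at zero
        new_conv.weight[:, : old_conv.in_channels] = old_conv.weight
        if old_conv.bias is not None:
            new_conv.bias.copy_(old_conv.bias)
    return new_conv


# Assumed shape of the restored (non-inpainting) input layer.
old_conv = nn.Conv2d(4, 320, kernel_size=3, padding=1)
new_conv = expand_conv_in(old_conv)

latents = torch.randn(1, 4, 64, 64)
extra = torch.randn(1, 5, 64, 64)  # stand-in for mask + masked-image latents
with torch.no_grad():
    out_old = old_conv(latents)
    out_new = new_conv(torch.cat([latents, extra], dim=1))
```

Because the weights for the added channels are zero, `out_new` matches `out_old` up to floating-point error, so the widened layer initially behaves like the restored checkpoint and only learns to use the mask inputs during inpainting training.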

- [sd-v1-5-inpainting.ckpt](https://huggingface.co/runwayml/stable-diffusion-inpainting/resolve/main/sd-v1-5-inpainting.ckpt)
## Examples:

You can use this repository with both the 🧨 Diffusers library and the original [GitHub repository](https://github.com/runwayml/stable-diffusion).

### Diffusers

```python
from io import BytesIO

import requests
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline


def download_image(url):
    """Fetch an image over HTTP and return it as an RGB PIL image."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # fail early on a bad download
    return Image.open(BytesIO(response.content)).convert("RGB")


# The model was trained at 512x512, so resize both inputs to match.
image = download_image("https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png")
image = image.resize((512, 512))
mask_image = download_image("https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png")
mask_image = mask_image.resize((512, 512))

# Load the half-precision weights to reduce GPU memory use.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    revision="fp16",
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")
pipe.enable_attention_slicing()  # trade some speed for lower peak memory

prompt = "Face of a yellow cat, high resolution, sitting on a park bench"

image = pipe(prompt=prompt, image=image, mask_image=mask_image).images[0]
image.save("./yellow_cat_on_park_bench.png")
```
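
If you do not have a ready-made mask, you can draw one yourself. The sketch below builds a simple rectangular mask with Pillow, relying on the Diffusers convention that white pixels are repainted and black pixels are kept. The helper `make_box_mask` and the box coordinates are hypothetical and only serve as an illustration.

```python
from PIL import Image, ImageDraw


def make_box_mask(size, box):
    """Hypothetical helper: white inside `box` (repainted), black elsewhere (kept)."""
    mask = Image.new("RGB", size, "black")
    ImageDraw.Draw(mask).rectangle(box, fill="white")
    return mask


# Repaint the central square of a 512x512 image.
mask_image = make_box_mask((512, 512), (128, 128, 384, 384))
```

The resulting `mask_image` can be passed as the `mask_image` argument of the pipeline call in place of the downloaded mask.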

**How it works:**

| `image` | `mask_image` | `prompt` | | **Output** |
|:-------------------------:|:-------------------------:|:-------------------------:|:-------------------------:|-------------------------:|
| <img src="https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png" alt="drawing" width="100"/> | <img src="https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png" alt="drawing" width="100"/> | ***Face of a yellow cat, high resolution, sitting on a park bench*** | **=>** | <img src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/test.png" alt="drawing" width="100"/> |
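
Internally, the image and mask reach the UNet as extra latent channels. The following sketch shows one way the 9-channel input could be assembled, using random tensors as stand-ins for real latents. The 8x downsampling factor and the channel order (noisy latents, then mask, then masked-image latents) are assumptions based on how the Diffusers inpainting pipeline is commonly described, not something this README states.

```python
import torch
import torch.nn.functional as F

batch, height, width = 1, 512, 512
vae_scale = 8  # assumed: latents are 8x smaller than the image per dimension

# Stand-ins for real tensors: noisy latents and the VAE-encoded masked image.
latents = torch.randn(batch, 4, height // vae_scale, width // vae_scale)
masked_image_latents = torch.randn(batch, 4, height // vae_scale, width // vae_scale)

# Binary pixel-space mask (1 = repaint), resized to the latent resolution.
mask = torch.zeros(batch, 1, height, width)
mask[:, :, 128:384, 128:384] = 1.0
mask_latent = F.interpolate(mask, size=latents.shape[-2:])  # nearest-neighbour

# 4 + 1 + 4 = 9 input channels for the inpainting UNet.
unet_input = torch.cat([latents, mask_latent, masked_image_latents], dim=1)
print(unet_input.shape)  # torch.Size([1, 9, 64, 64])
```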

### Original GitHub Repository

1. Download the weights: [sd-v1-5-inpainting.ckpt](https://huggingface.co/runwayml/stable-diffusion-inpainting/resolve/main/sd-v1-5-inpainting.ckpt).
2. Follow the inpainting instructions in the [runwayml/stable-diffusion README](https://github.com/runwayml/stable-diffusion#inpainting-with-stable-diffusion).

## Model Details
- **Developed by:** Robin Rombach, Patrick Esser