Commit 30ee9cf
Parent(s): 3c5bc0c
Update README.md
README.md
CHANGED
@@ -2,7 +2,9 @@
 license: creativeml-openrail-m
 tags:
 - stable-diffusion
+- stable-diffusion-diffusers
 - text-to-image
+inference: false
 library_name: "stable-diffusion"
 extra_gated_prompt: |-
   One more step before getting this model.
@@ -24,45 +26,38 @@ Stable Diffusion Inpainting is a latent text-to-image diffusion model capable of
 
 The **Stable-Diffusion-Inpainting** was initialized with the weights of the [Stable-Diffusion-v-1-2](https://steps/huggingface.co/CompVis/stable-diffusion-v-1-2-original). First 595k steps regular training, then 440k steps of inpainting training at resolution 512x512 on “laion-aesthetics v2 5+” and 10% dropping of the text-conditioning to improve classifier-free [classifier-free guidance sampling](https://arxiv.org/abs/2207.12598). For inpainting, the UNet has 5 additional input channels (4 for the encoded masked-image and 1 for the mask itself) whose weights were zero-initialized after restoring the non-inpainting checkpoint. During training, we generate synthetic masks and in 25% mask everything.
 
+[![Open In Spaces](https://camo.githubusercontent.com/00380c35e60d6b04be65d3d94a58332be5cc93779f630bcdfc18ab9a3a7d3388/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f25463025394625413425393725323048756767696e67253230466163652d5370616365732d626c7565)](https://huggingface.co/spaces/runwayml/stable-diffusion-inpainting)
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/in_painting_with_stable_diffusion_using_diffusers.ipynb)
+
 ## Examples:
 
-You can use this
+You can use this both with the [🧨Diffusers library](https://github.com/huggingface/diffusers) and the [RunwayML GitHub repository](https://github.com/runwayml/stable-diffusion).
 
 ### Diffusers
 
 ```python
-from io import BytesIO
-import torch
-import PIL
-import requests
 from diffusers import StableDiffusionInpaintPipeline
 
-def download_image(url):
-    response = requests.get(url)
-    return PIL.Image.open(BytesIO(response.content)).convert("RGB")
-
-image = download_image("https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png")
-image = image.resize((512, 512))
-mask_image = download_image("https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png")
-mask_image = mask_image.resize((512, 512))
-
 pipe = StableDiffusionInpaintPipeline.from_pretrained(
     "runwayml/stable-diffusion-inpainting",
     revision="fp16",
     torch_dtype=torch.float16,
 )
-pipe.to("cuda").enable_attention_slicing()
-
 prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
-
+#image and mask_image should be PIL images. The mask structure is white for inpainting and black for keeping as is
 image = pipe(prompt=prompt, image=image, mask_image=mask_image).images[0]
 image.save("./yellow_cat_on_park_bench.png")
 ```
 
 **How it works:**
-`image` | `mask_image`
-
-<img src="https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png" alt="drawing" width="
+`image` | `mask_image`
+:-------------------------:|:-------------------------:|
+<img src="https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png" alt="drawing" width="300"/> | <img src="https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png" alt="drawing" width="300"/>
+
+
+`prompt` | `Output`
+:-------------------------:|:-------------------------:|
+<span style="position: relative;bottom: 150px;">Face of a yellow cat, high resolution, sitting on a park bench</span> | <img src="https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/test.png" alt="drawing" width="300"/>
 
 ### Original GitHub Repository
 
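
Note on the shortened snippet: with the `download_image` helper and the imports removed, the example now assumes that `image` and `mask_image` already exist, and it still needs `import torch` for `torch.float16`. Below is a minimal sketch of preparing both inputs locally with Pillow, following the convention stated in the added comment (white = repaint, black = keep). The helper name `make_rect_mask`, the box coordinates, and the flat-colour placeholder image are illustrative assumptions and not part of the README.

```python
from PIL import Image, ImageDraw

SIZE = (512, 512)  # the card states that inpainting training ran at 512x512


def make_rect_mask(size, box):
    """Return an RGB mask: white inside `box` (repaint), black elsewhere (keep).

    Hypothetical helper for illustration; any PIL image with this
    white/black layout should serve as `mask_image`.
    """
    mask = Image.new("RGB", size, "black")
    ImageDraw.Draw(mask).rectangle(box, fill="white")
    return mask


# Placeholder input so the sketch runs offline. In practice one would load
# a real photo, e.g. Image.open("photo.png").convert("RGB").resize(SIZE).
image = Image.new("RGB", SIZE, (120, 160, 200))

# Mark the central square as the region to repaint.
mask_image = make_rect_mask(SIZE, (128, 128, 383, 383))
```

Both objects can then be passed as `image=image, mask_image=mask_image` in the pipeline call shown in the diff.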