Update readme (#2)

- Update readme (262f19b66e5c64f7ce35b250fe0c0560aa4650d9)

Co-authored-by: Takuma Mori <takuma104@users.noreply.huggingface.co>

Files changed (5)

README.md +117 -1
images/mask.png +0 -0
images/original.png +0 -0
images/output.png +0 -0
sd.png +0 -0

README.md CHANGED

@@ -10,4 +10,120 @@ duplicated_from: ControlNet-1-1-preview/control_v11p_sd15_inpaint
 # Controlnet - v1.1 - *InPaint Version*
-TODO

 # Controlnet - v1.1 - *InPaint Version*
+**Controlnet v1.1** is the successor model of [Controlnet v1.0](https://huggingface.co/lllyasviel/ControlNet)
+and was released in [lllyasviel/ControlNet-v1-1](https://huggingface.co/lllyasviel/ControlNet-v1-1) by [Lvmin Zhang](https://huggingface.co/lllyasviel).
+This checkpoint is a conversion of [the original checkpoint](https://huggingface.co/lllyasviel/ControlNet-v1-1/blob/main/control_v11p_sd15_inpaint.pth) into `diffusers` format.
+It can be used in combination with **Stable Diffusion**, such as [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5).
+For more details, please also have a look at the [🧨 Diffusers docs](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/controlnet).
+ControlNet is a neural network structure to control diffusion models by adding extra conditions.
+![img](./sd.png)
+This checkpoint corresponds to the ControlNet conditioned on **inpaint images**.
+## Model Details
+- **Developed by:** Lvmin Zhang, Maneesh Agrawala
+- **Model type:** Diffusion-based text-to-image generation model
+- **Language(s):** English
+- **License:** [The CreativeML OpenRAIL M license](https://huggingface.co/spaces/CompVis/stable-diffusion-license) is an [Open RAIL M license](https://www.licenses.ai/blog/2022/8/18/naming-convention-of-responsible-ai-licenses), adapted from the work that [BigScience](https://bigscience.huggingface.co/) and [the RAIL Initiative](https://www.licenses.ai/) are jointly carrying out in the area of responsible AI licensing. See also [the article about the BLOOM Open RAIL license](https://bigscience.huggingface.co/blog/the-bigscience-rail-license) on which our license is based.
+- **Resources for more information:** [GitHub Repository](https://github.com/lllyasviel/ControlNet), [Paper](https://arxiv.org/abs/2302.05543).
+- **Cite as:**
+  @misc{zhang2023adding,
+    title={Adding Conditional Control to Text-to-Image Diffusion Models},
+    author={Lvmin Zhang and Maneesh Agrawala},
+    year={2023},
+    eprint={2302.05543},
+    archivePrefix={arXiv},
+    primaryClass={cs.CV}
+  }
+## Introduction
+ControlNet was proposed in [*Adding Conditional Control to Text-to-Image Diffusion Models*](https://arxiv.org/abs/2302.05543) by
+Lvmin Zhang and Maneesh Agrawala.
+The abstract reads as follows:
+*We present a neural network structure, ControlNet, to control pretrained large diffusion models to support additional input conditions.
+The ControlNet learns task-specific conditions in an end-to-end way, and the learning is robust even when the training dataset is small (< 50k).
+Moreover, training a ControlNet is as fast as fine-tuning a diffusion model, and the model can be trained on a personal device.
+Alternatively, if powerful computation clusters are available, the model can scale to large amounts (millions to billions) of data.
+We report that large diffusion models like Stable Diffusion can be augmented with ControlNets to enable conditional inputs like edge maps, segmentation maps, keypoints, etc.
+This may enrich the methods to control large diffusion models and further facilitate related applications.*
+## Example
+It is recommended to use this checkpoint with [Stable Diffusion v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5), because the checkpoint
+was trained on that model.
+Experimentally, the checkpoint can also be used with other diffusion models, such as DreamBooth fine-tuned Stable Diffusion.
+**Note**: If you want to process an image to create the auxiliary conditioning, external dependencies are required as shown below:
+1. Install [controlnet_aux](https://github.com/patrickvonplaten/controlnet_aux):
+```sh
+$ pip install controlnet_aux==0.3.0
+```
+2. Let's install `diffusers` and related packages:
+```sh
+$ pip install diffusers transformers accelerate
+```
+3. Run code:
+```python
+import torch
+import os
+from diffusers.utils import load_image
+from PIL import Image
+import numpy as np
+from diffusers import (
+    ControlNetModel,
+    StableDiffusionControlNetPipeline,
+    UniPCMultistepScheduler,
+)
+checkpoint = "lllyasviel/control_v11p_sd15_inpaint"
+original_image = load_image(
+    "https://huggingface.co/lllyasviel/control_v11p_sd15_inpaint/resolve/main/images/original.png"
+)
+mask_image = load_image(
+    "https://huggingface.co/lllyasviel/control_v11p_sd15_inpaint/resolve/main/images/mask.png"
+)
+def make_inpaint_condition(image, image_mask):
+    # Scale the RGB image to float32 values in [0, 1]
+    image = np.array(image.convert("RGB")).astype(np.float32) / 255.0
+    # Load the mask as a single-channel 8-bit array
+    image_mask = np.array(image_mask.convert("L"))
+    # Image and mask must agree in both height and width
+    assert image.shape[0:2] == image_mask.shape[0:2], "image and image_mask must have the same image size"
+    image[image_mask < 128] = -1.0  # set as masked pixel
+    # Add a batch dimension and reorder HWC -> NCHW
+    image = np.expand_dims(image, 0).transpose(0, 3, 1, 2)
+    image = torch.from_numpy(image)
+    return image
+control_image = make_inpaint_condition(original_image, mask_image)
+prompt = "best quality"
+negative_prompt = "lowres, bad anatomy, bad hands, cropped, worst quality"
+controlnet = ControlNetModel.from_pretrained(checkpoint, torch_dtype=torch.float16)
+pipe = StableDiffusionControlNetPipeline.from_pretrained(
+    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
+)
+pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
+pipe.enable_model_cpu_offload()
+generator = torch.manual_seed(2)
+image = pipe(prompt, negative_prompt=negative_prompt, num_inference_steps=30,
+             generator=generator, image=control_image).images[0]
+os.makedirs("images", exist_ok=True)  # make sure the output directory exists
+image.save("images/output.png")
+```
+![original](./images/original.png)
+![mask](./images/mask.png)
+![inpaint_output](./images/output.png)
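
The conditioning step in the README example can be checked on a tiny synthetic image without downloading any model. The sketch below is a NumPy-only variant that takes arrays instead of PIL images; the function name `make_inpaint_condition_array` and the 2x2 test data are hypothetical and exist only for illustration. It uses the same threshold as the example (mask values below 128 are flagged with `-1.0`) and compares height and width before masking.

```python
import numpy as np


def make_inpaint_condition_array(image_rgb, mask_gray):
    """NumPy-only sketch of the README helper (hypothetical name).

    image_rgb: uint8 array of shape (H, W, 3)
    mask_gray: uint8 array of shape (H, W)
    Returns a float32 array of shape (1, 3, H, W).
    """
    # Scale pixel values to [0, 1]
    image = image_rgb.astype(np.float32) / 255.0
    # Image and mask must agree in height and width
    assert image.shape[0:2] == mask_gray.shape[0:2], "image and mask must have the same size"
    # Same threshold as the README example: mask values below 128 are flagged with -1.0
    image[mask_gray < 128] = -1.0
    # Add a batch dimension and reorder HWC -> NCHW
    return np.expand_dims(image, 0).transpose(0, 3, 1, 2)


# Illustrative data: an all-white 2x2 image whose top-left mask value is 0
image_rgb = np.full((2, 2, 3), 255, dtype=np.uint8)
mask_gray = np.array([[0, 255], [255, 255]], dtype=np.uint8)

condition = make_inpaint_condition_array(image_rgb, mask_gray)
print(condition.shape)         # (1, 3, 2, 2)
print(condition[0, :, 0, 0])   # all three channels of the flagged pixel are -1.0
print(condition[0, :, 0, 1])   # an unflagged pixel keeps its scaled value 1.0
```

Because `-1.0` lies outside the normal `[0, 1]` pixel range, flagged pixels stay distinguishable from any real color in the conditioning tensor.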
+## Other released checkpoints v1-1
+The authors released 14 different checkpoints, each trained with [Stable Diffusion v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5)
+on a different type of conditioning:
+| Model Name | Control Image Overview| Control Image Example | Generated Image Example |
+|---|---|---|---|
+TODO
+### Training
+TODO
+### Blog post
+For more information, please also have a look at the [Diffusers ControlNet Blog Post](https://huggingface.co/blog/controlnet).

images/mask.png ADDED

images/original.png ADDED

images/output.png ADDED

sd.png ADDED