---
license: openrail
base_model: runwayml/stable-diffusion-v1-5
tags:
- art
- controlnet
- stable-diffusion
---
|
|
|
# ControlNet
|
|
|
ControlNet is an auxiliary network that augments pre-trained diffusion models with an additional conditioning input.
|
|
|
ControlNet comes with multiple auxiliary models, each of which enables a different type of conditioning.
|
|
|
ControlNet's auxiliary models are trained with Stable Diffusion v1.5. Experimentally, they can also be used with other diffusion models, such as DreamBooth-finetuned versions of Stable Diffusion.
|
|
|
The auxiliary conditioning image is passed directly to the diffusers pipeline. If you want to derive that conditioning image from a source image, external dependencies are required.
|
|
|
Some of the conditionings can be extracted from images with additional detector models. We extracted these detectors from the original ControlNet repository into a separate package that can be found on [GitHub](https://github.com/patrickvonplaten/human_pose.git).
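All of the examples below follow the same pattern: load a `ControlNetModel`, plug it into a `StableDiffusionControlNetPipeline`, and call the pipeline with a prompt and a conditioning image. Here is a minimal sketch of that pattern, assuming a pre-computed conditioning image on disk (the Canny edge map produced in the next section); experimentally, the base checkpoint can be swapped for another Stable Diffusion v1.5 derivative such as a DreamBooth model.

```python
from PIL import Image
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel

# A pre-computed conditioning image (here, the Canny edge map used below).
conditioning = Image.open('images/bird_canny.png')

controlnet = ControlNetModel.from_pretrained(
    "fusing/stable-diffusion-v1-5-controlnet-canny",
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    # Experimentally, another SD v1.5-based checkpoint (e.g. a DreamBooth
    # model) can be substituted as the base model here.
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
)
pipe.to('cuda')

image = pipe("bird", conditioning).images[0]
```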
|
|
|
## Canny edge detection |
|
|
|
Install OpenCV:
|
|
|
```sh
$ pip install opencv-contrib-python
```
|
|
|
```python |
|
import cv2
from PIL import Image
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
import torch
import numpy as np

image = Image.open('images/bird.png')
image = np.array(image)

# Canny hysteresis thresholds: edges above high_threshold are kept,
# edges below low_threshold are discarded.
low_threshold = 100
high_threshold = 200

image = cv2.Canny(image, low_threshold, high_threshold)
# Replicate the single-channel edge map into a three-channel image.
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
image = Image.fromarray(image)

controlnet = ControlNetModel.from_pretrained(
    "fusing/stable-diffusion-v1-5-controlnet-canny",
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
)
pipe.to('cuda')

image = pipe("bird", image).images[0]

image.save('images/bird_canny_out.png')
|
``` |
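Sampling is stochastic, so outputs vary between runs. For reproducible results you can pass a seeded `torch.Generator` to the pipeline call; `generator` and `num_inference_steps` are standard diffusers pipeline arguments and apply to all of the examples below as well. A small sketch, reusing `pipe` from the snippet above, where `canny_image` is a stand-in name for the Canny conditioning image computed there:

```python
import torch

# Seeded generator => deterministic sampling for a fixed setup;
# num_inference_steps trades speed against quality.
generator = torch.Generator(device='cuda').manual_seed(0)
image = pipe("bird", canny_image, num_inference_steps=20, generator=generator).images[0]
```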
|
|
|
![bird](./images/bird.png) |
|
|
|
![bird_canny](./images/bird_canny.png) |
|
|
|
![bird_canny_out](./images/bird_canny_out.png) |
|
|
|
## M-LSD straight line detection
|
|
|
Install the additional ControlNet models package:
|
|
|
```sh
$ pip install git+https://github.com/patrickvonplaten/human_pose.git
```
|
|
|
```py |
|
from PIL import Image
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
import torch
from human_pose import MLSDdetector

mlsd = MLSDdetector.from_pretrained('lllyasviel/ControlNet')

image = Image.open('images/room.png')
image = mlsd(image)

controlnet = ControlNetModel.from_pretrained(
    "fusing/stable-diffusion-v1-5-controlnet-mlsd",
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
)
pipe.to('cuda')

image = pipe("room", image).images[0]

image.save('images/room_mlsd_out.png')
|
``` |
|
|
|
![room](./images/room.png) |
|
|
|
![room_mlsd](./images/room_mlsd.png) |
|
|
|
![room_mlsd_out](./images/room_mlsd_out.png) |
|
|
|
## Pose estimation |
|
|
|
Install the additional ControlNet models package:
|
|
|
```sh
$ pip install git+https://github.com/patrickvonplaten/human_pose.git
```
|
|
|
```py |
|
from PIL import Image
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
import torch
from human_pose import OpenposeDetector

openpose = OpenposeDetector.from_pretrained('lllyasviel/ControlNet')

image = Image.open('images/pose.png')
image = openpose(image)

controlnet = ControlNetModel.from_pretrained(
    "fusing/stable-diffusion-v1-5-controlnet-openpose",
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
)
pipe.to('cuda')

image = pipe("chef in the kitchen", image).images[0]

image.save('images/chef_pose_out.png')
|
``` |
|
|
|
![pose](./images/pose.png) |
|
|
|
![openpose](./images/openpose.png) |
|
|
|
![chef_pose_out](./images/chef_pose_out.png) |
|
|
|
## Semantic segmentation
|
|
|
Semantic segmentation relies on transformers. The transformers library is a dependency of diffusers for running ControlNet, so you should already have it installed.
|
|
|
```py |
|
from transformers import AutoImageProcessor, UperNetForSemanticSegmentation
from PIL import Image
import numpy as np
from controlnet_utils import ade_palette
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel

image_processor = AutoImageProcessor.from_pretrained("openmmlab/upernet-convnext-small")
image_segmentor = UperNetForSemanticSegmentation.from_pretrained("openmmlab/upernet-convnext-small")

image = Image.open("./images/house.png").convert('RGB')

pixel_values = image_processor(image, return_tensors="pt").pixel_values

with torch.no_grad():
    outputs = image_segmentor(pixel_values)

# Upscale the logits to the input resolution and take the per-pixel class label.
seg = image_processor.post_process_semantic_segmentation(outputs, target_sizes=[image.size[::-1]])[0]

# Color each class with its palette entry; ade_palette() returns one RGB
# triple per ADE20K class.
color_seg = np.zeros((seg.shape[0], seg.shape[1], 3), dtype=np.uint8)  # height, width, 3
palette = np.array(ade_palette())

for label, color in enumerate(palette):
    color_seg[seg == label, :] = color

color_seg = color_seg.astype(np.uint8)

image = Image.fromarray(color_seg)

controlnet = ControlNetModel.from_pretrained(
    "fusing/stable-diffusion-v1-5-controlnet-seg",
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
)
pipe.to('cuda')

image = pipe("house", image).images[0]

image.save('./images/house_seg_out.png')
|
``` |
|
|
|
![house](images/house.png) |
|
|
|
![house_seg](images/house_seg.png) |
|
|
|
![house_seg_out](images/house_seg_out.png) |
|
|
|
## Depth control |
|
|
|
Depth control relies on transformers. The transformers library is a dependency of diffusers for running ControlNet, so you should already have it installed.
|
|
|
```py |
|
from transformers import pipeline
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from PIL import Image
import numpy as np

depth_estimator = pipeline('depth-estimation')

image = Image.open('./images/stormtrooper.png')
image = depth_estimator(image)['depth']
image = np.array(image)
image = image[:, :, None]
image = np.concatenate([image, image, image], axis=2)
image = Image.fromarray(image)

controlnet = ControlNetModel.from_pretrained(
    "fusing/stable-diffusion-v1-5-controlnet-depth",
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
)
pipe.to('cuda')

image = pipe("Stormtrooper's lecture", image).images[0]

image.save('./images/stormtrooper_depth_out.png')
|
``` |
|
|
|
![stormtrooper](./images/stormtrooper.png) |
|
|
|
![stormtrooper_depth](./images/stormtrooper_depth.png)
|
|
|
![stormtrooper_depth_out](./images/stormtrooper_depth_out.png)
|
|
|
|
|
## Normal map |
|
|
|
```py |
|
from PIL import Image
from transformers import pipeline
import numpy as np
import cv2
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel

image = Image.open("images/toy.png").convert("RGB")

depth_estimator = pipeline("depth-estimation", model="Intel/dpt-hybrid-midas")

image = depth_estimator(image)['predicted_depth'][0]
image = image.numpy()

# Normalize the depth map to [0, 1] so the background threshold is
# independent of the depth scale.
image_depth = image.copy()
image_depth -= np.min(image_depth)
image_depth /= np.max(image_depth)

bg_threshold = 0.4

# The horizontal and vertical depth gradients become the x and y components
# of the surface normal; pixels below the background threshold are zeroed.
x = cv2.Sobel(image, cv2.CV_32F, 1, 0, ksize=3)
x[image_depth < bg_threshold] = 0

y = cv2.Sobel(image, cv2.CV_32F, 0, 1, ksize=3)
y[image_depth < bg_threshold] = 0

z = np.ones_like(x) * np.pi * 2.0

# Normalize each normal vector to unit length, then map from [-1, 1]
# into the [0, 255] RGB range.
image = np.stack([x, y, z], axis=2)
image /= np.sum(image ** 2.0, axis=2, keepdims=True) ** 0.5
image = (image * 127.5 + 127.5).clip(0, 255).astype(np.uint8)
image = Image.fromarray(image)

controlnet = ControlNetModel.from_pretrained(
    "fusing/stable-diffusion-v1-5-controlnet-normal",
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
)
pipe.to('cuda')

image = pipe("cute toy", image).images[0]

image.save('images/toy_normal_out.png')
|
``` |
|
|
|
![toy](./images/toy.png) |
|
|
|
![toy_normal](./images/toy_normal.png) |
|
|
|
![toy_normal_out](./images/toy_normal_out.png) |
|
|
|
## Scribble |
|
|
|
Install the additional ControlNet models package:
|
|
|
```sh
$ pip install git+https://github.com/patrickvonplaten/human_pose.git
```
|
|
|
```py |
|
from PIL import Image
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
import torch
from human_pose import HEDdetector

hed = HEDdetector.from_pretrained('lllyasviel/ControlNet')

image = Image.open('images/bag.png')

# scribble=True post-processes the soft HED boundary map into a binary,
# scribble-style sketch.
image = hed(image, scribble=True)

controlnet = ControlNetModel.from_pretrained(
    "fusing/stable-diffusion-v1-5-controlnet-scribble",
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
)
pipe.to('cuda')

image = pipe("bag", image).images[0]

image.save('images/bag_scribble_out.png')
|
``` |
|
|
|
![bag](./images/bag.png) |
|
|
|
![bag_scribble](./images/bag_scribble.png) |
|
|
|
![bag_scribble_out](./images/bag_scribble_out.png) |
|
|
|
## HED boundary
|
|
|
Install the additional ControlNet models package:
|
|
|
```sh
$ pip install git+https://github.com/patrickvonplaten/human_pose.git
```
|
|
|
```py |
|
from PIL import Image
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
import torch
from human_pose import HEDdetector

hed = HEDdetector.from_pretrained('lllyasviel/ControlNet')

image = Image.open('images/man.png')
image = hed(image)

controlnet = ControlNetModel.from_pretrained(
    "fusing/stable-diffusion-v1-5-controlnet-hed",
)

pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, safety_checker=None
)
pipe.to('cuda')

image = pipe("oil painting of handsome old man, masterpiece", image).images[0]

image.save('images/man_hed_out.png')
|
``` |
|
|
|
![man](./images/man.png) |
|
|
|
![man_hed](./images/man_hed.png) |
|
|
|
![man_hed_out](./images/man_hed_out.png) |
|
|