# Generated image comparison: Diffusers ControlNet implementation vs. reference implementation

## Implementation source code & versions
- Diffusers (in development version): https://github.com/takuma104/diffusers/tree/e758682c00a7d23e271fd8c9fb7a48912838045c
- Reference Impl.: https://github.com/lllyasviel/ControlNet/tree/ed0439b37ce0872789e425d46381e892dbc9b407 

## Environment
- OS: Ubuntu 22.04
- GPU: NVIDIA RTX 3090
- CUDA: 11.7
- PyTorch: 1.13.1

## Scripts to generate plots:
- Create control image: [create_control_images.py](create_control_images.py)
- Diffusers generated image: [gen_diffusers_image.py](gen_diffusers_image.py)
- Reference generated image: [gen_reference_image.py](gen_reference_image.py)
- Create Plots: [create_plots.py](create_plots.py)

## Original images for control images:
- All images are from [test_imgs](https://github.com/lllyasviel/ControlNet/tree/main/test_imgs) except the vermeer image, cropped and resized to 512x512 px.
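The crop-and-resize step can be sketched with Pillow as a centered square crop followed by a resize (a minimal sketch; the actual preprocessing lives in `create_control_images.py`, and `center_crop_resize` is a hypothetical helper name):

```python
from PIL import Image

def center_crop_resize(image: Image.Image, size: int = 512) -> Image.Image:
    # Crop the largest centered square, then resize it to size x size.
    w, h = image.size
    side = min(w, h)
    left = (w - side) // 2
    top = (h - side) // 2
    square = image.crop((left, top, left + side, top + side))
    return square.resize((size, size), Image.LANCZOS)
```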


<img width="128" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/control_images/bird_512x512.png" />
<img width="128" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/control_images/human_512x512.png" />
<img width="128" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/control_images/room_512x512.png" />
<img width="128" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/control_images/vermeer_512x512.png" />

## Generate Settings:
#### Control Images:
The original images above were converted with [controlnet_hinter](https://github.com/takuma104/controlnet_hinter).

#### Prompts:
All images were generated with the same prompt and negative prompt.

- Prompt: `best quality, extremely detailed`
- Negative Prompt: `lowres, bad anatomy, worst quality, low quality`

#### Other settings (common to both):
- sampler: DDIM
- guidance_scale: 9.0
- num_inference_steps: 20
- initial random latents: created on CPU using the seed
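Creating the initial latents on the CPU matters for this comparison: a CPU `torch.Generator` with a fixed seed produces identical starting noise regardless of which implementation or device runs the denoising, so both pipelines start from the same latents. A minimal sketch (the 1×4×64×64 shape follows from 512×512 pixels, the SD v1 VAE's 8× downscale, and 4 latent channels):

```python
import torch

def make_initial_latents(seed: int, height: int = 512, width: int = 512) -> torch.Tensor:
    # SD v1 latent space: 4 channels, spatial size = pixel size / 8.
    generator = torch.Generator(device="cpu").manual_seed(seed)
    return torch.randn((1, 4, height // 8, width // 8), generator=generator)

latents = make_initial_latents(seed=0)  # same seed -> same starting noise
```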

## Results:

<img width="720" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/plots/figure_canny.png" />
<img width="720" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/plots/figure_depth.png" />
<img width="720" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/plots/figure_hed.png" />
<img width="720" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/plots/figure_mlsd.png" />
<img width="720" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/plots/figure_normal.png" />
<img width="720" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/plots/figure_openpose.png" />
<img width="720" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/plots/figure_scribble.png" />
<img width="720" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/plots/figure_seg.png" />
