|
# Diffusers ControlNet Impl. & Reference Impl. generated image comparison |
|
|
|
## Implementation Source Code & versions |
|
- Diffusers (in development version): https://github.com/takuma104/diffusers/tree/e758682c00a7d23e271fd8c9fb7a48912838045c |
|
- Reference Impl.: https://github.com/lllyasviel/ControlNet/tree/ed0439b37ce0872789e425d46381e892dbc9b407 |
|
|
|
## Environment |
|
- OS: Ubuntu 22.04 |
|
- GPU: NVIDIA RTX 3090 |
|
- CUDA: 11.7 |
|
- PyTorch: 1.13.1 |
|
|
|
## Scripts to generate plots: |
|
- Create control image: [create_control_images.py](create_control_images.py) |
|
- Diffusers generated image: [gen_diffusers_image.py](gen_diffusers_image.py) |
|
- Reference generated image: [gen_reference_image.py](gen_reference_image.py) |
|
- Create Plots: [create_plots.py](create_plots.py) |
|
|
|
## Original images for control images: |
|
- All images are from [test_imgs](https://github.com/lllyasviel/ControlNet/tree/main/test_imgs), except the Vermeer image. Each was cropped and resized to 512x512 px. |
|
|
|
|
|
<img width="128" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/control_images/bird_512x512.png" /> |
|
<img width="128" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/control_images/human_512x512.png" /> |
|
<img width="128" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/control_images/room_512x512.png" /> |
|
<img width="128" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/control_images/vermeer_512x512.png" /> |
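
The crop method is not stated here. As a minimal sketch, assuming a center crop to a square before resizing, the crop box could be computed as follows. `center_square_box` is a hypothetical helper for illustration and is not taken from the scripts listed above.

```python
TARGET_SIZE = 512  # output side length in pixels, as stated above


def center_square_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)


# Example: a 768x512 landscape image keeps its middle 512x512 region.
box = center_square_box(768, 512)
print(box)  # (128, 0, 640, 512)
```

With Pillow, such a box could be passed to `Image.crop(box)`, followed by `Image.resize((TARGET_SIZE, TARGET_SIZE))`.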
|
|
|
## Generation Settings: |
|
#### Control Images: |
|
Generated from the original images above using [controlnet_hinter](https://github.com/takuma104/controlnet_hinter). |
|
|
|
#### Prompts: |
|
All images were generated with the same prompt. |
|
|
|
- Prompt: `best quality, extremely detailed` |
|
- Negative Prompt: `lowres, bad anatomy, worst quality, low quality` |
|
|
|
#### Other settings (common to both): |
|
- sampler: DDIM |
|
- guidance_scale: 9.0 |
|
- num_inference_steps: 20 |
|
- initial random latents: created on CPU using the seed |
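
Creating the initial latents on the CPU from a fixed seed means both implementations can start from identical noise, independent of the GPU random number generator. The sketch below shows one way to do this with PyTorch. The function name, the seed value, and the latent shape (4 channels, 1/8 spatial resolution, as in Stable Diffusion v1) are assumptions for illustration and are not taken from the scripts above.

```python
import torch


def make_initial_latents(seed, height=512, width=512, channels=4, vae_scale=8):
    """Draw initial latents with a seeded CPU generator.

    Assumes the Stable Diffusion v1 latent layout: `channels` latent
    channels at 1/`vae_scale` of the image resolution.
    """
    generator = torch.Generator(device="cpu").manual_seed(seed)
    shape = (1, channels, height // vae_scale, width // vae_scale)
    return torch.randn(shape, generator=generator, dtype=torch.float32)


# The seed value here is arbitrary; the same seed always yields the same tensor.
latents = make_initial_latents(seed=0)
print(latents.shape)
```

The resulting tensor can then be moved to the GPU with `latents.to("cuda")` and handed to each implementation as the starting noise.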
|
|
|
## Results: |
|
|
|
<img width="720" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/plots/figure_canny.png" /> |
|
<img width="720" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/plots/figure_depth.png" /> |
|
<img width="720" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/plots/figure_hed.png" /> |
|
<img width="720" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/plots/figure_mlsd.png" /> |
|
<img width="720" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/plots/figure_normal.png" /> |
|
<img width="720" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/plots/figure_openpose.png" /> |
|
<img width="720" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/plots/figure_scribble.png" /> |
|
<img width="720" src="https://huggingface.co/takuma104/controlnet_dev/resolve/main/gen_compare/plots/figure_seg.png" /> |
|
|
|
|
|
|
|
|
|
|
|