Transformers · English · controlnet · Inference Endpoints

Gerold Meisinger committed on
Commit 61baa04
1 Parent(s): a141c2f
Files changed (1): README.md (+9 -9)
README.md CHANGED
@@ -6,7 +6,7 @@ language:
 - en
 ---
 
-Controls image generation by edge maps generated with [Edge Drawing](https://github.com/CihanTopal/ED_Lib). Edge Drawing comes in different flavors: original (experiments 1-2), parameter-free (experiments 3+), color (not yet available).
+Controls image generation by edge maps generated with [Edge Drawing](https://github.com/CihanTopal/ED_Lib). Edge Drawing comes in different flavors: original (ed), parameter-free (edpf), color (edcolor).
 
 * Based on my monologues at [github.com - Edge Drawing](https://github.com/lllyasviel/ControlNet/discussions/318)
 * For usage see the model page on [civitai.com - Model](https://civitai.com/models/149740).
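For a quick start with diffusers, something like the following should work (a rough sketch only: the repo id and file names are placeholders, not confirmed by this README; see the civitai page for the released checkpoint and authoritative usage):

```python
# Hypothetical usage sketch: repo id and file names are placeholders,
# not confirmed by this README. Requires torch and diffusers.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "GeroldMeisinger/control-edgedrawing",  # placeholder repo id
    torch_dtype=torch.float16,
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

edge_map = load_image("edge_map.png")  # precomputed edpf map, white edges on black
image = pipe("a cozy living room", image=edge_map, num_inference_steps=20).images[0]
image.save("output.png")
```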
@@ -60,12 +60,12 @@ accelerate launch train_controlnet.py ^
 # Evaluation
 
 To evaluate the model it makes sense to compare it with the original Canny model. Original evaluations and comparisons are available at the [ControlNet 1.0 repo](https://github.com/lllyasviel/ControlNet), the [ControlNet 1.1 repo](https://github.com/lllyasviel/ControlNet-v1-1-nightly), [ControlNet paper v1](https://arxiv.org/abs/2302.05543v1), [ControlNet paper v2](https://arxiv.org/abs/2302.05543) and the [Diffusers implementation](https://huggingface.co/takuma104/controlnet_dev/tree/main). Some points we have to keep in mind when comparing canny with edpf, in order not to compare apples with oranges:
-* canny 1.0 model was trained on 3M images, canny 1.1 model on even more, while edpf model so far is only trained on a 180k-360k.
+* The canny 1.0 model was trained on 3M images with fp32 and the canny 1.1 model on even more, while the edpf model so far is only trained on 180k-360k images with fp16.
 * The canny edge detector requires parameter tuning while edpf is parameter-free (see the Canny sketch below).
 * Do we manually fine-tune canny to find the perfect input image, or do we leave it at default? We could argue that "no fine-tuning required" is the USP of edpf and we want to compare in the default setting, whereas canny fine-tuning is subjective.
-* Would the canny model actually benefit from a edpf pre-processor and we might not even require a edpf model?
+* Would the canny model actually benefit from an edpf pre-processor, so that we might not even require an edpf model? (2023-09-25: see `eval_canny_edpf.zip`; it seems this doesn't work, so the edpf model may be justified.)
 * When evaluating human images we need to be aware of Stable Diffusion's inherent limits, like deformed faces and hands.
-* When evaluating style we need to be aware of the bias from the image dataset (laion2b-en-aesthetics65), which might tend to generate "aesthetic" images, and not actually work "intrisically better".
+* When evaluating style we need to be aware of the bias from the image dataset (`laion2b-en-aesthetics65`), which might tend to generate "aesthetic" images rather than actually work "intrinsically better".
 
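To make the parameter-tuning point concrete, here is a small illustration (my own sketch, thresholds picked arbitrarily): the same photo gives very different Canny maps depending on the two thresholds, and there is no universally "right" pair, while edpf has no such knobs.

```python
# Illustration of the Canny tuning problem (my own sketch, not part of the
# evaluation): the edge map changes drastically with the two thresholds.
import cv2

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)
for low, high in [(50, 100), (100, 200), (200, 400)]:
    edges = cv2.Canny(gray, low, high)  # which pair is "best" is subjective
    cv2.imwrite(f"canny_{low}_{high}.png", edges)
```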
 # Versions
 
@@ -86,7 +86,7 @@ Images converted with https://github.com/shaojunluo/EDLinePython (based on origi
 
 additional arguments: `--proportion_empty_prompts=0.5`.
 
-Trained for 40000 steps with default settings => empty prompts were probably too excessive
+Trained for 40000 steps with default settings => results are not good, empty prompts were probably too excessive (see the caption-drop sketch below). Retry with no drops and different algorithm parameters.
 
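For reference, `--proportion_empty_prompts` in diffusers' `train_controlnet.py` replaces that fraction of captions with empty strings during training, so the model also learns to follow the conditioning image without text guidance. Roughly:

```python
# Rough sketch of what --proportion_empty_prompts does in diffusers'
# train_controlnet.py: a fraction of captions is replaced by "".
import random

def maybe_drop_caption(caption: str, proportion_empty_prompts: float) -> str:
    if random.random() < proportion_empty_prompts:
        return ""  # empty prompt: the model must rely on the edge map alone
    return caption
```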
 Update 2023-09-22: bug in algorithm produces too sparse images on default, see https://github.com/shaojunluo/EDLinePython/issues/4
 
@@ -94,7 +94,7 @@ Update 2023-09-22: bug in algorithm produces too sparse images on default, see h
 
 Same as experiment 1 with `smoothed=True` and `--proportion_empty_prompts=0`.
 
-Trained for 40000 steps with default settings => conditioning images are too noisy
+Trained for 40000 steps with default settings => results are not good, conditioning images look too noisy. Investigate algorithm.
 
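A quick way to catch both failure modes (too-sparse maps from the bug above, too-noisy maps as in this experiment) is to check the edge-pixel density of the conditioning images before training. This is my own diagnostic sketch; the thresholds are arbitrary guesses:

```python
# My own diagnostic sketch (thresholds are arbitrary): flag edge maps that
# are suspiciously sparse (bugged output) or suspiciously dense (noisy).
import cv2
import numpy as np

def edge_density(path: str) -> float:
    edge_map = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    return np.count_nonzero(edge_map) / edge_map.size

d = edge_density("condition.png")
if d < 0.005:
    print(f"too sparse ({d:.2%}), possibly the EDLinePython default bug")
elif d > 0.25:
    print(f"too noisy ({d:.2%}), check smoothing and algorithm parameters")
```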
 **Experiment 3.0 - 2023-09-22 - control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-45000**
 
@@ -109,7 +109,7 @@ edges = ed.detectEdges(image)
 edge_map = ed.getEdgeImage(edges)
 ```
 
-45000 steps => This is **version 0.1 on civitai**.
+45000 steps => looks good. Released as **version 0.1 on civitai**.
 
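The hunk above only shows the tail of the pre-processing code. A self-contained version, as far as I can reconstruct it (opencv-contrib-python >= 4.8 for `ximgproc`; treating `PFmode` as the parameter-free switch is my assumption), would be:

```python
# Reconstruction of the experiment 3.0 pre-processing, my best guess from
# the snippet tail above. Needs opencv-contrib-python >= 4.8.
import cv2

image = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)  # EdgeDrawing expects grayscale
ed = cv2.ximgproc.createEdgeDrawing()
params = cv2.ximgproc.EdgeDrawing_Params()
params.PFmode = True  # assumption: parameter-free (edpf) mode
ed.setParams(params)
edges = ed.detectEdges(image)
edge_map = ed.getEdgeImage(edges)
cv2.imwrite("edge_map.png", edge_map)
```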
 **Experiment 3.1 - 2023-09-24 - control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-90000**
 
@@ -121,13 +121,13 @@ resumed with epoch 2 from 90000 using `--proportion_empty_prompts=0.5` => result
 
 **Experiment 4.0 - 2023-09-25 - control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000**
 
-see experiment 3.0. restarted from 0 with `--proportion_empty_prompts=0.5` =>
+See experiment 3.0. Restarted from 0 with `--proportion_empty_prompts=0.5` => results are not good, 50% is probably too much for 45k steps. Guess mode still doesn't work and tends to produce humans. Resuming until 90k with right-left flipped images (see the flip sketch at the end), in the hope it will get better with more images.
 
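For the guess-mode check I use the standard diffusers flag (reusing `pipe` and `edge_map` from the usage sketch near the top; the flag is standard diffusers API, not specific to this model):

```python
# Guess mode: generate from the conditioning image alone with an empty
# prompt; this is what training with 50% empty prompts should enable.
image = pipe(
    "",                  # empty prompt
    image=edge_map,      # edpf conditioning image
    guess_mode=True,
    guidance_scale=3.0,  # lower guidance is commonly recommended here
).images[0]
```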
 **Experiment 4.1 - control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000**
 
 # Ideas
 
-* fine-tune off canny
+* fine-tune from canny
 * cleanup image dataset (l65)
 * uncropped mod64 images
 * integrate edcolor
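The right-left flip mentioned in experiment 4.0 has to be applied to the training image and its edge map together, otherwise conditioning and target drift apart. A minimal sketch (my own illustration):

```python
# My own sketch of the right-left flip augmentation: image and edge map
# are flipped together so the conditioning stays aligned with the target.
import numpy as np

def flip_pair(image: np.ndarray, edge_map: np.ndarray):
    return np.fliplr(image).copy(), np.fliplr(edge_map).copy()
```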
 