Gerold Meisinger committed on
Commit d3bbaf6
1 Parent(s): 0bb179b

control-edgedrawing-cv480edpf-fastdup-fp16-checkpoint-45000

README.md CHANGED
@@ -8,18 +8,20 @@ tags:
  - controlnet
---

- Controls image generation by edge maps generated with [Edge Drawing](https://github.com/CihanTopal/ED_Lib). Edge Drawing comes in different flavors: original (_ed_), parameter free (_edpf_), color (_edcolor_).
+ Controls image generation by edge maps generated with [Edge Drawing](https://github.com/CihanTopal/ED_Lib). Note that Edge Drawing comes in different flavors: original (_ed_), parameter free (_edpf_) and color (_edcolor_).

* Based on my monologs at [github.com - Edge Drawing](https://github.com/lllyasviel/ControlNet/discussions/318)
* For usage see the model page on [civitai.com - Model](https://civitai.com/models/149740).
- * To generate edpf maps you can use the [space](https://huggingface.co/spaces/GeroldMeisinger/edpf) or script from [gitlab.com - edpf.py](https://gitlab.com/-/snippets/3601881).
- * For evaluation see the corresponding .zip with images in "files".
- * To run your own evaluations you can use the script [gitlab.com - inference.py](https://gitlab.com/-/snippets/3602096).
+ * To generate edpf maps you can use [this space](https://huggingface.co/spaces/GeroldMeisinger/edpf) or [this script at gitlab.com](https://gitlab.com/-/snippets/3601881).
+ * For evaluation see the corresponding .zip files with images under "Files and versions".
+ * To run your own evaluations you can use [this script at gitlab.com](https://gitlab.com/-/snippets/3602096).

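As a rough illustration of what the linked edpf script does: OpenCV's contrib module `ximgproc` ships Edge Drawing with a parameter-free mode (the `cv480` in the checkpoint names suggests OpenCV 4.8.0). A minimal sketch, with placeholder file names:

```python
# minimal edpf pre-processor sketch using OpenCV's ximgproc module
# (requires opencv-contrib-python; file names are placeholders)
import cv2

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)

ed = cv2.ximgproc.createEdgeDrawing()
params = cv2.ximgproc_EdgeDrawing_Params()
params.PFmode = True  # edpf: the parameter-free Edge Drawing variant
ed.setParams(params)

ed.detectEdges(gray)
edges = ed.getEdgeImage()  # 8-bit edge map, white edges on black

cv2.imwrite("edpf.png", edges)
```
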
**Edge Drawing Parameter Free**

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c0ec65a2ec8cb2f589233a/jmdCGeMJx4dKFGo44cuEq.png)

+ _Clear and pristine! Wooow!_
+
**Example**

sampler=UniPC steps=20 cfg=7.5 seed=0 batch=9 model: v1-5-pruned-emaonly.safetensors cherry-picked: 1/9
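
For reference, a minimal diffusers sketch reproducing these settings (UniPC, 20 steps, cfg 7.5, seed 0). This is not the linked inference.py; the controlnet repo id, prompt and file names are placeholders:

```python
# minimal diffusers sketch with the settings above: sampler=UniPC, steps=20,
# cfg=7.5, seed=0 (repo id, prompt and file names are placeholders)
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, UniPCMultistepScheduler
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "GeroldMeisinger/control-edgedrawing", torch_dtype=torch.float16  # placeholder repo id
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

image = pipe(
    "your prompt here",
    image=load_image("edpf.png"),  # edge map from the edpf pre-processor
    num_inference_steps=20,
    guidance_scale=7.5,
    generator=torch.Generator("cuda").manual_seed(0),
).images[0]
image.save("output.png")
```
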
@@ -28,8 +30,6 @@ prompt: _a detailed high-quality professional photo of swedish woman standing in
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c0ec65a2ec8cb2f589233a/2PSWsmzLdHeVG-i67S7jF.png)

- _Clear and pristine! Wooow!_
-
**Canny Edge for comparison (default in Automatic1111)**

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c0ec65a2ec8cb2f589233a/JZTpa-HZfw0NUYnxZ52Iu.png)
@@ -66,10 +66,10 @@ accelerate launch train_controlnet.py ^
To evaluate the model it makes sense to compare it with the original Canny model. Original evaluations and comparisons are available at the [ControlNet 1.0 repo](https://github.com/lllyasviel/ControlNet), the [ControlNet 1.1 repo](https://github.com/lllyasviel/ControlNet-v1-1-nightly), the [ControlNet paper v1](https://arxiv.org/abs/2302.05543v1), the [ControlNet paper v2](https://arxiv.org/abs/2302.05543) and the [Diffusers implementation](https://huggingface.co/takuma104/controlnet_dev/tree/main). Some points we have to keep in mind when comparing canny with edpf, in order not to compare apples with oranges:
* the canny 1.0 model was trained on 3M images with fp32, and the canny 1.1 model on even more, while the edpf model so far has only been trained on 180k-360k images with fp16.
* the canny edge detector requires parameter tuning while edpf is parameter-free.
- * Do we manually fine-tune canny to find the perfect input image or do we leave it at default? We could argue that "no fine-tuning required" is the usp of edpf and we want to compare in the default setting, whereas canny fine-tuning is subjective.
- * Would the canny model actually benefit from a edpf pre-processor and we might not even require a edpf model? (2023-09-25: see `eval_canny_edpf.zip` but it seems as if it doesn't work and the edpf model may be justified)
+ * Should we manually fine-tune canny to find the perfect input image, or leave it at its defaults? We could argue that "no fine-tuning required" is the USP of edpf, so we want to compare in the default setting, whereas canny fine-tuning is subjective.
+ * Would the canny model actually benefit from an edpf pre-processor, so that we might not even require a specialized edpf model? (2023-09-25: see `eval_canny_edpf.zip`; it seems this doesn't work, so the edpf model may be justified.)
* When evaluating human images we need to be aware of Stable Diffusion's inherent limits, like deformed faces and hands, and not attribute them to the ControlNet.
- * When evaluating style we need to be aware of the bias from the image dataset (`laion2b-en-aesthetics65`), which might tend to generate "aesthetic" images, and not actually work "intrisically better".
+ * When evaluating style we need to be aware of the bias from the image dataset (`laion2b-en-aesthetics65`), which might tend towards generating "aesthetic" images rather than actually working "intrinsically better".

# Versions
 
@@ -129,16 +129,50 @@ see experiment 3.0. restarted from 0 with `--proportion_empty_prompts=0.5` => re
**Experiment 4.1 - 2023-09-26 - control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-90000**

- resumed from 45000 steps with left-right flipped images => results are still not good, 50% is probably too much for 45k steps. guessmode still doesn't work and tends to produces humans. abort.
+ resumed from 45000 steps with left-right flipped images until 90000 steps => results are still not good; 50% is probably also too much for 90k steps. guessmode still doesn't work and tends to produce humans. aborting.
+
+ **Experiment 5.0 - 2023-09-28 - control-edgedrawing-cv480edpf-fastdup-fp16-checkpoint-45000**
+
+ see experiment 3.0. cleaned the original images following the [fastdup introduction](https://nbviewer.org/github/visual-layer/fastdup/blob/main/examples/cleaning-image-dataset.ipynb), resulting in:
+ ```
+ 180210 images in total
+ 67854 duplicates
+ 644 outliers
+ 26 too dark
+ 321 too bright
+ 57 blurry
+ 68621 unique removed (that's 38%!)
+ ```
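
A rough sketch of such a fastdup pass, assuming the v1-style API from the linked notebook (directory names and the exact drop criteria are placeholders):

```python
# rough sketch of the fastdup cleaning pass from the linked notebook
# (directory names are placeholders; thresholds follow fastdup's defaults)
import fastdup

fd = fastdup.create(work_dir="fastdup_work", input_dir="images")
fd.run()

cc = fd.connected_components()   # clusters of (near-)duplicate images
outliers = fd.outliers()         # images far away from everything else
stats = fd.img_stats()           # per-image statistics (brightness, blur, ...)

# from these tables, keep one representative per duplicate cluster and drop
# outliers plus too-dark/too-bright/blurry images before training
```
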
+
+ restarted from 0 with left-right flipped images and `--mixed_precision="no"` to create a master release, and converted it to fp16 afterwards.
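
The conversion step can be as simple as loading the trained ControlNet in fp32 and saving it back in half precision; a sketch with placeholder paths:

```python
# sketch of the fp32 -> fp16 conversion of the trained controlnet
# (input/output paths are placeholders)
from diffusers import ControlNetModel

model = ControlNetModel.from_pretrained("output/checkpoint-45000/controlnet")
model.half().save_pretrained("control-edgedrawing-fp16", safe_serialization=True)
```
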
+
+ **Experiment 6.0 - control-edgedrawing-cv480edpf-rect-fp16-checkpoint-XXXXX**
+
+ see experiment 5.0.
+ * included images with aspect ratio > 2
+ * resized images with the short side to 512, which gives us rectangular images instead of 512x512 squares
+ * center-cropped images to 512x(n*64) (to make them SD-compatible) with a max long side of 1024 (a sketch of this follows below)
+ * sorted duplicates by the `similarity` value from `laion2b-en-aesthetics65` to keep the best `text` of all duplicates
+
+ ```
+ 183410 images in total
+ 75686 duplicates
+ 381 outliers
+ 50 too dark
+ 436 too bright
+ 31 blurry
+ 76288 unique removed (that's 42%!)
+ ```
+
+ restarted from 0 with `--mixed_precision="fp16"`.
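
A sketch of the rectangular resize-and-crop described in the list above (short side to 512, long side center-cropped to a multiple of 64 and capped at 1024); purely illustrative:

```python
# sketch of the rectangular preprocessing: resize the short side to 512,
# then center-crop the long side to a multiple of 64, capped at 1024
from PIL import Image

def resize_and_crop(img: Image.Image) -> Image.Image:
    w, h = img.size
    scale = 512 / min(w, h)
    w, h = round(w * scale), round(h * scale)
    img = img.resize((w, h), Image.LANCZOS)
    cw = min(w // 64 * 64, 1024) if w > h else 512
    ch = min(h // 64 * 64, 1024) if h >= w else 512
    left, top = (w - cw) // 2, (h - ch) // 2
    return img.crop((left, top, left + cw, top + ch))

resize_and_crop(Image.open("input.png")).save("cropped.png")
```
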

# Ideas

- * fine-tune from canny
- * cleanup image dataset (l65)
- * uncropped mod64 images
+ * make conceptual captions for laion
* integrate edcolor
- * bigger image dataset (gcc)
- * cleanup image dataset (gcc)
+ * try to fine-tune from canny
+ * image dataset with better captions (cc3m)
+ * remove images by semantics (use only photos, paintings etc. for edge detection)
* re-train with fp32

# Question and answers
control-edgedrawing-cv480edpf-fastdup-fp16-checkpoint-45000.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ea1d8bf1f7e7b5dbb501aeb3c294cf60120c6b56575c16173f43ffacc68f4a8d
+ size 722598616