Gerold Meisinger committed
Commit a141c2f · Parent(s): d06ca26
control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000

Browse files:
- README.md +62 -34
- config.json +51 -0
- control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-45000.zip +2 -2
- control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-90000.zip +2 -2
- control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000.safetensors +3 -0
- control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000.zip +3 -0
- eval_canny_canny.zip +3 -0
- eval_canny_edpf.zip +3 -0
- eval_input.zip +3 -0
README.md
CHANGED
---
license: cc-by-nc-sa-4.0
datasets:
- ChristophSchuhmann/improved_aesthetics_6.5plus
language:
- en
---

Controls image generation by edge maps generated with [Edge Drawing](https://github.com/CihanTopal/ED_Lib). Edge Drawing comes in different flavors: original (experiments 1-2), parameter-free (experiments 3+), color (not yet available).

* Based on my monologs at [github.com - Edge Drawing](https://github.com/lllyasviel/ControlNet/discussions/318)
* For usage see the model page on [civitai.com - Model](https://civitai.com/models/149740).
* To generate edpf maps you can use the script [gitlab.com - edpf.py](https://gitlab.com/-/snippets/3601881).
* For evaluation see the corresponding .zip with images in "files".
* To run your own evaluations you can use the script [gitlab.com - inference.py](https://gitlab.com/-/snippets/3602096).

**Edge Drawing Parameter Free**

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c0ec65a2ec8cb2f589233a/jmdCGeMJx4dKFGo44cuEq.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c0ec65a2ec8cb2f589233a/2PSWsmzLdHeVG-i67S7jF.png)

**Canny Edge for comparison (default in Automatic1111)**

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c0ec65a2ec8cb2f589233a/JZTpa-HZfw0NUYnxZ52Iu.png)

_notice all the missing edges, the noise and artifacts. yuck! ugh!_

# Image dataset

* [laion2B-en aesthetics>=6.5 dataset](https://huggingface.co/datasets/ChristophSchuhmann/improved_aesthetics_6.5plus)
```
--seed=0
```

# Evaluation

To evaluate the model it makes sense to compare it with the original Canny model. Original evaluations and comparisons are available at the [ControlNet 1.0 repo](https://github.com/lllyasviel/ControlNet), the [ControlNet 1.1 repo](https://github.com/lllyasviel/ControlNet-v1-1-nightly), the [ControlNet paper v1](https://arxiv.org/abs/2302.05543v1), the [ControlNet paper v2](https://arxiv.org/abs/2302.05543) and the [Diffusers implementation](https://huggingface.co/takuma104/controlnet_dev/tree/main). Some points we have to keep in mind when comparing canny with edpf, in order not to compare apples with oranges:

* The canny 1.0 model was trained on 3M images, the canny 1.1 model on even more, while the edpf model has so far only been trained on 180k-360k images.
* The canny edge detector requires parameter tuning, while edpf is parameter-free.
* Do we manually fine-tune canny to find the perfect input image, or do we leave it at its defaults? One could argue that "no fine-tuning required" is the USP of edpf and we want to compare in the default setting, whereas canny fine-tuning is subjective.
* Would the canny model actually benefit from an edpf pre-processor, so that we might not even need a dedicated edpf model?
* When evaluating human images we need to be aware of Stable Diffusion's inherent limits, like deformed faces and hands.
* When evaluating style we need to be aware of the bias from the image dataset (laion2b-en-aesthetics65), which might tend to generate "aesthetic" images rather than work "intrinsically better".
# Versions

**Experiment 1 - 2023-09-19 - control-edgedrawing-default-drop50-fp16-checkpoint-40000**

Images converted with https://github.com/shaojunluo/EDLinePython (based on the original, non-parameter-free Edge Drawing). Default settings are:

`smoothed=False`

```
{ 'ksize'            : 5
, 'sigma'            : 1.0
, 'gradientThreshold': 36
, 'anchorThreshold'  : 8
, 'scanIntervals'    : 1
}
```

additional arguments: `--proportion_empty_prompts=0.5`.
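`--proportion_empty_prompts` drops captions during training so the ControlNet also learns to follow the conditioning image without a prompt. What the flag does can be sketched roughly like this (a simplified stand-in for the caption handling in diffusers' train_controlnet.py, not the actual implementation):

```python
import random

def maybe_drop_prompt(prompt: str, proportion_empty_prompts: float, rng: random.Random) -> str:
    """Replace the caption with an empty string with the given probability."""
    return "" if rng.random() < proportion_empty_prompts else prompt

# With a drop rate of 0.5, roughly half of the captions end up empty.
rng = random.Random(0)
captions = [maybe_drop_prompt("a photo of a cat", 0.5, rng) for _ in range(1000)]
```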

Trained for 40000 steps with default settings => empty prompts were probably too excessive.

Update 2023-09-22: a bug in the algorithm produces images that are too sparse on default settings, see https://github.com/shaojunluo/EDLinePython/issues/4

**Experiment 2 - 2023-09-20 - control-edgedrawing-default-noisy-drop0-fp16-checkpoint-40000**

Same as experiment 1 with `smoothed=True` and `--proportion_empty_prompts=0`.

Trained for 40000 steps with default settings => conditioning images are too noisy.

**Experiment 3.0 - 2023-09-22 - control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-45000**

Conditioning images generated with [edpf.py](https://gitlab.com/-/snippets/3601881) using [opencv-contrib-python::ximgproc::EdgeDrawing](https://docs.opencv.org/4.8.0/d1/d1c/classcv_1_1ximgproc_1_1EdgeDrawing.html).

```
edges = ed.detectEdges(image)
edge_map = ed.getEdgeImage(edges)
```

45000 steps => This is **version 0.1 on civitai**.

**Experiment 3.1 - 2023-09-24 - control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-90000**

90000 steps (45000 steps on the original images, 45000 steps with left-right flipped images) => quality became better, might release as 0.2 on civitai.

**Experiment 3.2 - 2023-09-24 - control-edgedrawing-cv480edpf-drop0+50-fp16-checkpoint-118000**

Resumed with epoch 2 from 90000 using `--proportion_empty_prompts=0.5` => results became worse, the ControlNet didn't pick up on no-prompt generation (I also tried the intermediate checkpoint-104000). Restarting with 50% drop.

**Experiment 4.0 - 2023-09-25 - control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000**

See experiment 3.0. Restarted from 0 with `--proportion_empty_prompts=0.5` =>

**Experiment 4.1 - control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000**

# Ideas

* fine-tune off canny
* cleanup image dataset (l65)
* uncropped mod64 images
* integrate edcolor
* bigger image dataset (gcc)
* cleanup image dataset (gcc)
* re-train with fp32

# Question and answers
config.json
ADDED
```json
{
  "_class_name": "ControlNetModel",
  "_diffusers_version": "0.22.0.dev0",
  "_name_or_path": "control-edgedrawing-pf-drop0-fp16/checkpoint-71000",
  "act_fn": "silu",
  "addition_embed_type": null,
  "addition_embed_type_num_heads": 64,
  "addition_time_embed_dim": null,
  "attention_head_dim": 8,
  "block_out_channels": [
    320,
    640,
    1280,
    1280
  ],
  "class_embed_type": null,
  "conditioning_channels": 3,
  "conditioning_embedding_out_channels": [
    16,
    32,
    96,
    256
  ],
  "controlnet_conditioning_channel_order": "rgb",
  "cross_attention_dim": 768,
  "down_block_types": [
    "CrossAttnDownBlock2D",
    "CrossAttnDownBlock2D",
    "CrossAttnDownBlock2D",
    "DownBlock2D"
  ],
  "downsample_padding": 1,
  "encoder_hid_dim": null,
  "encoder_hid_dim_type": null,
  "flip_sin_to_cos": true,
  "freq_shift": 0,
  "global_pool_conditions": false,
  "in_channels": 4,
  "layers_per_block": 2,
  "mid_block_scale_factor": 1,
  "norm_eps": 1e-05,
  "norm_num_groups": 32,
  "num_attention_heads": null,
  "num_class_embeds": null,
  "only_cross_attention": false,
  "projection_class_embeddings_input_dim": null,
  "resnet_time_scale_shift": "default",
  "transformer_layers_per_block": 1,
  "upcast_attention": false,
  "use_linear_projection": false
}
```
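The config describes a standard SD 1.5-scale ControlNet: 768-dim cross-attention (the CLIP ViT-L text encoder used by Stable Diffusion 1.x), 4 latent input channels, and 3-channel RGB conditioning for the edge maps. A quick sanity check over the key fields (values copied verbatim from the file above) needs nothing beyond plain json:

```python
import json

# Subset of the config.json above; values copied verbatim.
config = json.loads("""
{
  "_class_name": "ControlNetModel",
  "cross_attention_dim": 768,
  "block_out_channels": [320, 640, 1280, 1280],
  "conditioning_channels": 3,
  "in_channels": 4
}
""")

assert config["_class_name"] == "ControlNetModel"
assert config["cross_attention_dim"] == 768  # CLIP ViT-L text encoder => SD 1.x
assert config["conditioning_channels"] == 3  # RGB edge maps as conditioning input
assert config["in_channels"] == 4            # SD latent channels
```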
control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-45000.zip
CHANGED

version https://git-lfs.github.com/spec/v1
oid sha256:431d99dc147ab50c2672093f217d48d1f3735d071c471d4a017559bffe013cd9
size 5582414

control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-90000.zip
CHANGED

version https://git-lfs.github.com/spec/v1
oid sha256:fdb6844d462fd0d60fd4f57cd8de07f6b2861f218f2a415ff26f2271c96cd044
size 4928594

control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000.safetensors
ADDED

version https://git-lfs.github.com/spec/v1
oid sha256:322df64c934f6ffd7fd58ad79baa07aa8de34b42f3d7b777e77ffb915cdf1808
size 722598616

control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000.zip
ADDED

version https://git-lfs.github.com/spec/v1
oid sha256:efd2831e49f4af3564291cb7db60dd1d73b9afd87e734e8e802f8467a6133869
size 5777208

eval_canny_canny.zip
ADDED

version https://git-lfs.github.com/spec/v1
oid sha256:193a684863a49d0f9a2b2d89b930483374ff9d472af174093823336c63c977ad
size 4512344

eval_canny_edpf.zip
ADDED

version https://git-lfs.github.com/spec/v1
oid sha256:94fb8603912563a8ccab5efe58bbe7105829f7619cbeff10330429ac23b337d9
size 3887066

eval_input.zip
ADDED

version https://git-lfs.github.com/spec/v1
oid sha256:9c509718d6c0cc0e234c3e402a6f632afde6d9bdd27a49afb7b25246a4ec1df1
size 3617968
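The .zip and .safetensors entries above are Git LFS pointer files, not the payloads themselves; git-lfs resolves the `oid`/`size` pair to the actual blob. A small hypothetical helper to read the pointer format (`key value` lines, one per line):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into a key/value dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# Pointer content copied from the eval_input.zip entry above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:9c509718d6c0cc0e234c3e402a6f632afde6d9bdd27a49afb7b25246a4ec1df1
size 3617968"""

info = parse_lfs_pointer(pointer)
# info["size"] is the payload size in bytes (as a string).
```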