Gerold Meisinger committed on
Commit
a141c2f
1 Parent(s): d06ca26

control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000

README.md CHANGED
@@ -1,20 +1,20 @@
1
  ---
2
- license: openrail
3
  datasets:
4
  - ChristophSchuhmann/improved_aesthetics_6.5plus
5
  language:
6
  - en
7
  ---
8
 
9
- Controls image generation by edge maps generated with [EdgeDrawing Parameter-Free](https://github.com/CihanTopal/ED_Lib).
10
 
11
- * Based on my GitHub monologs at [Edge Drawing - a Canny alternative](https://github.com/lllyasviel/ControlNet/discussions/318)
12
- * For usage see the model page on [Civitai.com](https://civitai.com/models/149740).
13
- * To generate edpf maps you can use the script [edpf.py](https://gitlab.com/-/snippets/3601881).
14
  * For evaluation see the corresponding .zip with images in "files".
15
- * To run your own evaluations you can use [inference.py](https://gitlab.com/-/snippets/3602096).
16
 
17
- **EdgeDrawing Parameter-Free**
18
 
19
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c0ec65a2ec8cb2f589233a/jmdCGeMJx4dKFGo44cuEq.png)
20
 
@@ -26,10 +26,12 @@ prompt: _a detailed high-quality professional photo of swedish woman standing in
26
 
27
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c0ec65a2ec8cb2f589233a/2PSWsmzLdHeVG-i67S7jF.png)
28
 
29
- **Canny Edge Detection (default in Automatic1111)**
30
 
31
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c0ec65a2ec8cb2f589233a/JZTpa-HZfw0NUYnxZ52Iu.png)
32
 
 
 
33
  # Image dataset
34
 
35
  * [laion2B-en aesthetics>=6.5 dataset](https://huggingface.co/datasets/ChristophSchuhmann/improved_aesthetics_6.5plus)
@@ -55,13 +57,46 @@ accelerate launch train_controlnet.py ^
55
  --seed=0
56
  ```
57
 
58
  # Versions
59
 
60
- **Experiment 5 - control-edgedrawing-cv480edpf-drop0+50-fp16-checkpoint-118000**
61
 
62
- see experiment 4. resumed with epoch 2 from 90000 using `--proportion_empty_prompts=0.5` => results became worse, CN didn't pick up on no-prompts (I also tried checkpoint-104000). restarting with 50% drop.
63
 
64
- **Experiment 4 - control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-90000**
65
 
66
  Conditioning images generated with [edpf.py](https://gitlab.com/-/snippets/3601881) using [opencv-contrib-python::ximgproc::EdgeDrawing](https://docs.opencv.org/4.8.0/d1/d1c/classcv_1_1ximgproc_1_1EdgeDrawing.html).
67
 
@@ -74,38 +109,31 @@ edges = ed.detectEdges(image)
74
  edge_map = ed.getEdgeImage(edges)
75
  ```
76
 
77
- 90000 steps (45000 steps on original, 45000 steps with left-right flipped images)
78
 
79
- **Experiment 3 - control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-45000**
80
 
81
- see experiment 4. 45000 steps. This is version 0.1 on civitai.
82
 
83
- **Experiment 2 - control-edgedrawing-default-noisy-drop0-fp16-checkpoint-40000**
84
-
85
- Images converted with https://github.com/shaojunluo/EDLinePython
86
-
87
- Default settings are:
88
-
89
- `smoothed=False`
90
 
91
- ```
92
- { 'ksize' : 5
93
- , 'sigma' : 1.0
94
- , 'gradientThreshold': 36
95
- , 'anchorThreshold' : 8
96
- , 'scanIntervals' : 1
97
- }
98
- ```
99
 
100
- `smoothed=True`, but no empty prompts. Trained for 40000 steps with default settings => conditioning images are too noisy.
101
 
102
- **Experiment 1 - control-edgedrawing-default-drop50-fp16-checkpoint-40000**
103
 
104
- Same as experiment 2.
105
 
106
- Update: bug in algorithm produces too sparse images on default, see https://github.com/shaojunluo/EDLinePython/issues/4
107
 
108
- additional arguments: `--proportion_empty_prompts=0.5`. Trained for 40000 steps with default settings => empty prompts were probably too excessive
109
 
110
  # Questions and answers
111
 
 
1
  ---
2
+ license: cc-by-nc-sa-4.0
3
  datasets:
4
  - ChristophSchuhmann/improved_aesthetics_6.5plus
5
  language:
6
  - en
7
  ---
8
 
9
+ Controls image generation by edge maps generated with [Edge Drawing](https://github.com/CihanTopal/ED_Lib). Edge Drawing comes in different flavors: original (experiments 1-2), parameter-free (experiments 3+), color (not yet available).
10
 
11
+ * Based on my monologs at [github.com - Edge Drawing](https://github.com/lllyasviel/ControlNet/discussions/318)
12
+ * For usage see the model page on [civitai.com - Model](https://civitai.com/models/149740).
13
+ * To generate edpf maps you can use the script [gitlab.com - edpf.py](https://gitlab.com/-/snippets/3601881).
14
  * For evaluation see the corresponding .zip with images in "files".
15
+ * To run your own evaluations you can use the script [gitlab.com - inference.py](https://gitlab.com/-/snippets/3602096).
16
 
17
+ **Edge Drawing Parameter Free**
18
 
19
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c0ec65a2ec8cb2f589233a/jmdCGeMJx4dKFGo44cuEq.png)
20
 
 
26
 
27
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c0ec65a2ec8cb2f589233a/2PSWsmzLdHeVG-i67S7jF.png)
28
 
29
+ **Canny Edge for comparison (default in Automatic1111)**
30
 
31
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c0ec65a2ec8cb2f589233a/JZTpa-HZfw0NUYnxZ52Iu.png)
32
 
33
+ _notice all the missing edges, the noise and artifacts. yuck! ugh!_
34
+
35
  # Image dataset
36
 
37
  * [laion2B-en aesthetics>=6.5 dataset](https://huggingface.co/datasets/ChristophSchuhmann/improved_aesthetics_6.5plus)
 
57
  --seed=0
58
  ```
59
 
60
+ # Evaluation
61
+
62
+ To evaluate the model it makes sense to compare it with the original Canny model. Original evaluations and comparisons are available at [ControlNet 1.0 repo](https://github.com/lllyasviel/ControlNet), [ControlNet 1.1 repo](https://github.com/lllyasviel/ControlNet-v1-1-nightly), [ControlNet paper v1](https://arxiv.org/abs/2302.05543v1), [ControlNet paper v2](https://arxiv.org/abs/2302.05543) and [Diffusers implementation](https://huggingface.co/takuma104/controlnet_dev/tree/main). Some points we have to keep in mind when comparing canny with edpf in order not to compare apples with oranges:
63
+ * canny 1.0 model was trained on 3M images, canny 1.1 model on even more, while edpf model so far is only trained on 180k-360k images.
64
+ * canny edge-detector requires parameter tuning while edpf is parameter-free.
65
+ * Do we manually fine-tune canny to find the perfect input image or do we leave it at default? We could argue that "no fine-tuning required" is the USP (unique selling point) of edpf and we want to compare in the default setting, whereas canny fine-tuning is subjective.
66
+ * Would the canny model actually benefit from an edpf pre-processor, so that we might not even require an edpf model?
67
+ * When evaluating human images we need to be aware of Stable Diffusion's inherent limits, like deformed faces and hands.
68
+ * When evaluating style we need to be aware of the bias from the image dataset (laion2b-en-aesthetics65), which might tend to generate "aesthetic" images, and not actually work "intrinsically better".
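The dataset-size point in the list above can be sanity-checked with simple arithmetic. The following is only a rough sketch: the effective batch size of 4 is an assumption (it is what 180k images over 45000 steps would imply; the full training command is not visible in this diff), and the 3M figure is taken from the bullet above. Note that samples seen and unique images are not the same thing, since the 90000-step run reuses the same images left-right flipped.

```python
# Rough sample-count arithmetic for the canny vs. edpf comparison.
# ASSUMPTION: an effective batch size of 4, inferred from
# 180k images / 45000 steps; it is not stated in this diff.
EFFECTIVE_BATCH_SIZE = 4

# Figure quoted in the README for the canny 1.0 model.
CANNY_1_0_IMAGES = 3_000_000


def samples_seen(steps: int, batch_size: int = EFFECTIVE_BATCH_SIZE) -> int:
    """Number of training samples processed after `steps` optimizer steps."""
    return steps * batch_size


for steps in (45_000, 90_000):
    seen = samples_seen(steps)
    ratio = CANNY_1_0_IMAGES / seen
    print(f"{steps} steps -> {seen} samples; canny 1.0 saw roughly {ratio:.1f}x more")
```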
69
+
70
  # Versions
71
 
72
+ **Experiment 1 - 2023-09-19 - control-edgedrawing-default-drop50-fp16-checkpoint-40000**
73
+
74
+ Images converted with https://github.com/shaojunluo/EDLinePython (based on original (non-parameter free) edge drawing). Default settings are:
75
+
76
+ `smoothed=False`
77
+
78
+ ```
79
+ { 'ksize' : 5
80
+ , 'sigma' : 1.0
81
+ , 'gradientThreshold': 36
82
+ , 'anchorThreshold' : 8
83
+ , 'scanIntervals' : 1
84
+ }
85
+ ```
86
+
87
+ additional arguments: `--proportion_empty_prompts=0.5`.
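As far as I understand the diffusers `train_controlnet.py` example, `--proportion_empty_prompts` replaces each caption with an empty string with the given probability. The sketch below only illustrates that idea; `drop_prompts` is a hypothetical helper and not the script's actual code.

```python
import random


def drop_prompts(captions, proportion_empty_prompts, rng):
    """Replace each caption with "" with the given probability.

    Hypothetical helper that mimics the idea behind
    --proportion_empty_prompts; not taken from the training script.
    """
    return ["" if rng.random() < proportion_empty_prompts else caption
            for caption in captions]


rng = random.Random(0)  # fixed seed so the run is repeatable
captions = [f"caption {i}" for i in range(10_000)]
dropped = drop_prompts(captions, 0.5, rng)

empty_share = sum(caption == "" for caption in dropped) / len(dropped)
print(f"share of empty prompts: {empty_share:.2f}")
```

With a value of 0.5 roughly half of the training samples carry no text, so the model has to rely on the edge map alone for those samples.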
88
 
89
+ Trained for 40000 steps with default settings => empty prompts were probably too excessive

90
 
91
+ Update 2023-09-22: bug in algorithm produces too sparse images on default, see https://github.com/shaojunluo/EDLinePython/issues/4
92
+
93
+ **Experiment 2 - 2023-09-20 - control-edgedrawing-default-noisy-drop0-fp16-checkpoint-40000**
94
+
95
+ Same as experiment 1 with `smoothed=True` and `--proportion_empty_prompts=0`.
96
+
97
+ Trained for 40000 steps with default settings => conditioning images are too noisy
98
+
99
+ **Experiment 3.0 - 2023-09-22 - control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-45000**
100
 
101
  Conditioning images generated with [edpf.py](https://gitlab.com/-/snippets/3601881) using [opencv-contrib-python::ximgproc::EdgeDrawing](https://docs.opencv.org/4.8.0/d1/d1c/classcv_1_1ximgproc_1_1EdgeDrawing.html).
102
 
 
109
  edge_map = ed.getEdgeImage(edges)
110
  ```
111
 
112
+ 45000 steps => This is **version 0.1 on civitai**.
113
 
114
+ **Experiment 3.1 - 2023-09-24 - control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-90000**
115
 
116
+ 90000 steps (45000 steps on original, 45000 steps with left-right flipped images) => quality became better, might release as 0.2 on civitai.
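When training on left-right flipped images, the target image and its conditioning image have to be mirrored together, otherwise the edges no longer line up with the picture. A minimal sketch with nested lists standing in for image arrays; whether the edge maps were flipped or recomputed from the flipped images is not stated here, so this shows only the first option.

```python
def hflip(image):
    """Mirror a row-major image (a list of pixel rows) left to right."""
    return [list(reversed(row)) for row in image]


def flipped_pair(image, edge_map):
    """Flip the target image and its edge map together so they stay aligned."""
    return hflip(image), hflip(edge_map)


# Tiny stand-ins for a target image and its edge map (2 rows x 3 columns).
image = [[1, 2, 3],
         [4, 5, 6]]
edge_map = [[0, 255, 0],
            [255, 0, 0]]

flipped_image, flipped_edges = flipped_pair(image, edge_map)
print(flipped_image)
print(flipped_edges)
```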
117
 
118
+ **Experiment 3.2 - 2023-09-24 - control-edgedrawing-cv480edpf-drop0+50-fp16-checkpoint-118000**
119
 
120
+ resumed with epoch 2 from 90000 using `--proportion_empty_prompts=0.5` => results became worse, CN didn't pick up on no-prompts (I also tried intermediate checkpoint-104000). restarting with 50% drop.
121
 
122
+ **Experiment 4.0 - 2023-09-25 - control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000**
123
 
124
+ see experiment 3.0. restarted from 0 with `--proportion_empty_prompts=0.5` =>
125
 
126
+ **Experiment 4.1 - control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000**
127
 
128
+ # Ideas
129
 
130
+ * fine-tune off canny
131
+ * cleanup image dataset (l65)
132
+ * uncropped mod64 images
133
+ * integrate edcolor
134
+ * bigger image dataset (gcc)
135
+ * cleanup image dataset (gcc)
136
+ * re-train with fp32
137
 
138
  # Questions and answers
139
 
config.json ADDED
@@ -0,0 +1,51 @@
1
+ {
2
+ "_class_name": "ControlNetModel",
3
+ "_diffusers_version": "0.22.0.dev0",
4
+ "_name_or_path": "control-edgedrawing-pf-drop0-fp16/checkpoint-71000",
5
+ "act_fn": "silu",
6
+ "addition_embed_type": null,
7
+ "addition_embed_type_num_heads": 64,
8
+ "addition_time_embed_dim": null,
9
+ "attention_head_dim": 8,
10
+ "block_out_channels": [
11
+ 320,
12
+ 640,
13
+ 1280,
14
+ 1280
15
+ ],
16
+ "class_embed_type": null,
17
+ "conditioning_channels": 3,
18
+ "conditioning_embedding_out_channels": [
19
+ 16,
20
+ 32,
21
+ 96,
22
+ 256
23
+ ],
24
+ "controlnet_conditioning_channel_order": "rgb",
25
+ "cross_attention_dim": 768,
26
+ "down_block_types": [
27
+ "CrossAttnDownBlock2D",
28
+ "CrossAttnDownBlock2D",
29
+ "CrossAttnDownBlock2D",
30
+ "DownBlock2D"
31
+ ],
32
+ "downsample_padding": 1,
33
+ "encoder_hid_dim": null,
34
+ "encoder_hid_dim_type": null,
35
+ "flip_sin_to_cos": true,
36
+ "freq_shift": 0,
37
+ "global_pool_conditions": false,
38
+ "in_channels": 4,
39
+ "layers_per_block": 2,
40
+ "mid_block_scale_factor": 1,
41
+ "norm_eps": 1e-05,
42
+ "norm_num_groups": 32,
43
+ "num_attention_heads": null,
44
+ "num_class_embeds": null,
45
+ "only_cross_attention": false,
46
+ "projection_class_embeddings_input_dim": null,
47
+ "resnet_time_scale_shift": "default",
48
+ "transformer_layers_per_block": 1,
49
+ "upcast_attention": false,
50
+ "use_linear_projection": false
51
+ }
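A few fields of this config say which base model the ControlNet fits. The sketch below parses an excerpt of the values above with the standard library; `summarize` is a hypothetical helper, and the reading of `cross_attention_dim: 768` as the text-embedding width used by Stable Diffusion 1.x is my interpretation, not something the commit states.

```python
import json

# Excerpt of the config.json added in this commit (values copied from the diff).
CONFIG_EXCERPT = """
{
  "_class_name": "ControlNetModel",
  "block_out_channels": [320, 640, 1280, 1280],
  "conditioning_channels": 3,
  "controlnet_conditioning_channel_order": "rgb",
  "cross_attention_dim": 768,
  "in_channels": 4
}
"""


def summarize(config):
    """Collect the fields that matter for base-model compatibility."""
    channels = config["conditioning_channels"]
    order = config["controlnet_conditioning_channel_order"]
    return {
        "class": config["_class_name"],
        "text_embedding_width": config["cross_attention_dim"],
        "latent_channels": config["in_channels"],
        "conditioning": f"{channels}-channel {order}",
    }


summary = summarize(json.loads(CONFIG_EXCERPT))
for key, value in summary.items():
    print(f"{key}: {value}")
```

The conditioning input is a 3-channel RGB image, so a single-channel edge map would need to be replicated across three channels before being passed in.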
control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-45000.zip CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:16d4d36d3a577b1fa18a5f07a80d1bfc8ca10f1461d8c7e45d79f7285d496b3c
3
- size 17597750
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:431d99dc147ab50c2672093f217d48d1f3735d071c471d4a017559bffe013cd9
3
+ size 5582414
control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-90000.zip CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:fb2349062e57cbf5d25e249760f6893c20fb34fb099a98378496e7105f91f07f
3
- size 16943930
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:fdb6844d462fd0d60fd4f57cd8de07f6b2861f218f2a415ff26f2271c96cd044
3
+ size 4928594
control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:322df64c934f6ffd7fd58ad79baa07aa8de34b42f3d7b777e77ffb915cdf1808
3
+ size 722598616
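The three lines above are a Git LFS pointer: the repository stores only the spec version, a SHA-256 object id and the byte size, while the weights themselves live in LFS storage. A minimal sketch of how a downloaded file could be checked against such a pointer; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and a file of this size would in practice be hashed in chunks rather than loaded into memory at once.

```python
import hashlib

# Pointer contents copied from the diff above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:322df64c934f6ffd7fd58ad79baa07aa8de34b42f3d7b777e77ffb915cdf1808
size 722598616
"""


def parse_lfs_pointer(text):
    """Split a pointer file into its version, hash algorithm, digest and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer) -> bool:
    """Return True if `data` has the size and digest recorded in the pointer."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```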
control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000.zip ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:efd2831e49f4af3564291cb7db60dd1d73b9afd87e734e8e802f8467a6133869
3
+ size 5777208
eval_canny_canny.zip ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:193a684863a49d0f9a2b2d89b930483374ff9d472af174093823336c63c977ad
3
+ size 4512344
eval_canny_edpf.zip ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:94fb8603912563a8ccab5efe58bbe7105829f7619cbeff10330429ac23b337d9
3
+ size 3887066
eval_input.zip ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9c509718d6c0cc0e234c3e402a6f632afde6d9bdd27a49afb7b25246a4ec1df1
3
+ size 3617968