license: cc-by-nc-sa-4.0
datasets:
- ChristophSchuhmann/improved_aesthetics_6.5plus
language:
- en
Controls image generation by edge maps generated with Edge Drawing. Edge Drawing comes in different flavors: original (experiments 1-2), parameter-free (experiments 3+), color (not yet available).
- Based on my monologs at github.com - Edge Drawing
- For usage see the model page on civitai.com - Model.
- To generate edpf maps you can use the script gitlab.com - edpf.py.
- For evaluation see the corresponding .zip with images in "files".
- To run your own evaluations you can use the script gitlab.com - inference.py.
Edge Drawing Parameter Free
Example
sampler=UniPC steps=20 cfg=7.5 seed=0 batch=9 model: v1-5-pruned-emaonly.safetensors cherry-picked: 1/9
prompt: a detailed high-quality professional photo of swedish woman standing in front of a mirror, dark brown hair, white hat with purple feather
Canny Edge for comparison (default in Automatic1111)
notice all the missing edges, the noise and artifacts. yuck! ugh!
Image dataset
- laion2B-en aesthetics>=6.5 dataset
--min_image_size 512 --max_aspect_ratio 2 --resize_mode="center_crop" --image_size 512
- resulting in 180k images
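The filter flags above can be read as a simple per-image rule. The sketch below is only an illustration. It assumes that --min_image_size applies to the shorter side, that --max_aspect_ratio compares the longer side to the shorter side, and that center_crop scales the shorter side to --image_size before cutting out the centered square. The function names are hypothetical and not part of any download tool.

```python
def passes_filter(width, height, min_image_size=512, max_aspect_ratio=2.0):
    """Return True if an image of this size would be kept (assumed semantics)."""
    short_side, long_side = min(width, height), max(width, height)
    if short_side < min_image_size:
        return False
    return long_side / short_side <= max_aspect_ratio


def center_crop_box(width, height, image_size=512):
    """Scale the shorter side to image_size, then return the centered square
    crop as (left, top, right, bottom) in the resized image's coordinates."""
    short_side = min(width, height)
    new_w = round(width * image_size / short_side)
    new_h = round(height * image_size / short_side)
    left = (new_w - image_size) // 2
    top = (new_h - image_size) // 2
    return left, top, left + image_size, top + image_size


if __name__ == "__main__":
    print(passes_filter(1024, 768))    # large enough, aspect ratio below 2
    print(passes_filter(2000, 512))    # too elongated under the assumed rule
    print(center_crop_box(1024, 512))  # crop box inside the resized image
```

Under these assumptions every kept image ends up as a 512x512 square, which matches --resolution=512 in the training command.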
Training
accelerate launch train_controlnet.py ^
--pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" ^
--output_dir="control-edgedrawing-[version]-fp16/" ^
--dataset_name="mydataset" ^
--mixed_precision="fp16" ^
--resolution=512 ^
--learning_rate=1e-5 ^
--train_batch_size=1 ^
--gradient_accumulation_steps=4 ^
--gradient_checkpointing ^
--use_8bit_adam ^
--enable_xformers_memory_efficient_attention ^
--set_grads_to_none ^
--seed=0
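To relate the step counts in the experiment names to passes over the dataset, a small calculation helps. This is a sketch under two assumptions: training runs on a single GPU, and one reported step is one optimizer update that consumes train_batch_size times gradient_accumulation_steps samples. The helper names are made up for illustration.

```python
import math


def effective_batch_size(train_batch_size, gradient_accumulation_steps, num_gpus=1):
    """Samples consumed per optimizer update (assumed single-node setup)."""
    return train_batch_size * gradient_accumulation_steps * num_gpus


def steps_per_epoch(num_images, train_batch_size, gradient_accumulation_steps, num_gpus=1):
    """Optimizer updates needed to see every image once."""
    batch = effective_batch_size(train_batch_size, gradient_accumulation_steps, num_gpus)
    return math.ceil(num_images / batch)


if __name__ == "__main__":
    # Values from the command above, with the dataset rounded to 180k images.
    print(effective_batch_size(1, 4))
    print(steps_per_epoch(180_000, 1, 4))
```

With roughly 180k images this comes out at about 45000 steps per epoch, which is consistent with checkpoints being taken at 45000 and 90000 steps.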
Evaluation
To evaluate the model it makes sense to compare it with the original Canny model. Original evaluations and comparisons are available at ControlNet 1.0 repo, ControlNet 1.1 repo, ControlNet paper v1, ControlNet paper v2 and Diffusers implementation. Some points we have to keep in mind when comparing canny with edpf in order not to compare apples with oranges:
- The canny 1.0 model was trained on 3M images and the canny 1.1 model on even more, while the edpf model has so far been trained on only 180k-360k images.
- canny edge-detector requires parameter tuning while edpf is parameter-free.
- Do we manually fine-tune canny to find the perfect input image or do we leave it at default? We could argue that "no fine-tuning required" is the USP of edpf, so we want to compare in the default setting, whereas canny fine-tuning is subjective.
- Would the canny model actually benefit from an edpf pre-processor, so that we might not even require an edpf model?
- When evaluating human images we need to be aware of Stable Diffusion's inherent limits, like deformed faces and hands.
- When evaluating style we need to be aware of the bias from the image dataset (laion2b-en-aesthetics65), which might tend to generate "aesthetic" images, and not actually work "intrinsically better".
Versions
Experiment 1 - 2023-09-19 - control-edgedrawing-default-drop50-fp16-checkpoint-40000
Images converted with https://github.com/shaojunluo/EDLinePython (based on original (non-parameter free) edge drawing). Default settings are:
smoothed=False
{ 'ksize' : 5
, 'sigma' : 1.0
, 'gradientThreshold': 36
, 'anchorThreshold' : 8
, 'scanIntervals' : 1
}
additional arguments: --proportion_empty_prompts=0.5.
Trained for 40000 steps with default settings => dropping 50% of the prompts was probably excessive
Update 2023-09-22: a bug in the algorithm produces overly sparse images with the default settings, see https://github.com/shaojunluo/EDLinePython/issues/4
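The --proportion_empty_prompts option replaces a fraction of the captions with an empty string during training, so the ControlNet also has to learn from the conditioning image alone. The following is a simplified sketch of that idea and not the training script's actual code. The function name and the rng parameter are hypothetical.

```python
import random


def drop_captions(captions, proportion_empty_prompts, rng=None):
    """Replace each caption with "" with the given probability."""
    rng = rng or random.Random()
    return [
        "" if rng.random() < proportion_empty_prompts else caption
        for caption in captions
    ]


if __name__ == "__main__":
    captions = ["a photo of a cat", "a red car", "a mountain lake", "a city at night"]
    # Seeded generator so repeated runs give the same result.
    print(drop_captions(captions, 0.5, rng=random.Random(0)))
```

A value of 0 keeps every caption (experiment 2) and 0.5 blanks about half of them (experiment 1).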
Experiment 2 - 2023-09-20 - control-edgedrawing-default-noisy-drop0-fp16-checkpoint-40000
Same as experiment 1 with smoothed=True and --proportion_empty_prompts=0.
Trained for 40000 steps with default settings => conditioning images are too noisy
Experiment 3.0 - 2023-09-22 - control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-45000
Conditioning images generated with edpf.py using opencv-contrib-python::ximgproc::EdgeDrawing.
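import cv2  # cv2.ximgproc requires opencv-contrib-python

# image: 8-bit single-channel (grayscale) input
ed = cv2.ximgproc.createEdgeDrawing()
params = cv2.ximgproc.EdgeDrawing.Params()
params.PFmode = True  # parameter-free mode (EDPF)
ed.setParams(params)
edges = ed.detectEdges(image)
edge_map = ed.getEdgeImage(edges)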
45000 steps => This is version 0.1 on civitai.
Experiment 3.1 - 2023-09-24 - control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-90000
90000 steps (45000 steps on original, 45000 steps with left-right flipped images) => quality became better, might release as 0.2 on civitai.
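When training on left-right flipped images, the image and its edge map have to be mirrored together, otherwise the conditioning no longer lines up with the target. How the flipped set was actually produced is not stated here, so the following is only a dependency-free sketch of that pairing on nested lists, with hypothetical function names. For NumPy arrays, numpy.fliplr does the same per array.

```python
def flip_lr(rows):
    """Mirror a 2D image (list of rows) left to right."""
    return [list(reversed(row)) for row in rows]


def flip_pair(image, edge_map):
    """Flip the training image and its conditioning image consistently."""
    return flip_lr(image), flip_lr(edge_map)


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6]]
    edge_map = [[0, 255, 0], [255, 0, 0]]
    print(flip_pair(image, edge_map))
```

Regenerating the edge map from the flipped image would be an alternative to flipping the existing map.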
Experiment 3.2 - 2023-09-24 - control-edgedrawing-cv480edpf-drop0+50-fp16-checkpoint-118000
resumed with epoch 2 from 90000 using --proportion_empty_prompts=0.5
=> results became worse, CN didn't pick up on empty prompts (I also tried the intermediate checkpoint-104000). Restarting with 50% drop.
Experiment 4.0 - 2023-09-25 - control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000
see experiment 3.0. restarted from 0 with --proportion_empty_prompts=0.5
=>
Experiment 4.1 - control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000
Ideas
- fine-tune off canny
- cleanup image dataset (l65)
- uncropped mod64 images
- integrate edcolor
- bigger image dataset (gcc)
- cleanup image dataset (gcc)
- re-train with fp32
Questions and answers
Q: What's the point of another edge control net anyway?
A: 🤷