---
license: cc-by-nc-sa-4.0
datasets:
- ChristophSchuhmann/improved_aesthetics_6.5plus
language:
- en
---
Controls image generation with edge maps generated by [Edge Drawing](https://github.com/CihanTopal/ED_Lib). Edge Drawing comes in different flavors: original (experiments 1-2), parameter-free (experiments 3+) and color (not yet available).
* Based on my monologues at [github.com - Edge Drawing](https://github.com/lllyasviel/ControlNet/discussions/318).
* For usage see the model page on [civitai.com - Model](https://civitai.com/models/149740).
* To generate edpf maps you can use the script [gitlab.com - edpf.py](https://gitlab.com/-/snippets/3601881).
* For evaluations, see the corresponding .zip archives with images under "Files".
* To run your own evaluations you can use the script [gitlab.com - inference.py](https://gitlab.com/-/snippets/3602096); a minimal diffusers sketch follows below.
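For quick experiments without the snippet, here is a minimal [diffusers](https://github.com/huggingface/diffusers) sketch. Hedged: the local checkpoint path, edge-map file and prompt are placeholders, not fixed names from this repo; the sampler settings mirror the example below.
```
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, UniPCMultistepScheduler
from diffusers.utils import load_image

# "path/to/controlnet" is a placeholder for a local copy of one of the checkpoints in this repo.
controlnet = ControlNetModel.from_pretrained("path/to/controlnet", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)  # sampler=UniPC
pipe.enable_model_cpu_offload()

edge_map = load_image("edges.png")  # an edpf edge map, e.g. produced by edpf.py
image = pipe(
    "a detailed high-quality professional photo of swedish woman standing in front of a mirror",
    image=edge_map,
    num_inference_steps=20,  # steps=20
    guidance_scale=7.5,      # cfg=7.5
    generator=torch.Generator("cpu").manual_seed(0),  # seed=0
).images[0]
image.save("out.png")
```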
**Edge Drawing Parameter Free**
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c0ec65a2ec8cb2f589233a/jmdCGeMJx4dKFGo44cuEq.png)
**Example**
sampler=UniPC steps=20 cfg=7.5 seed=0 batch=9 model=v1-5-pruned-emaonly.safetensors cherry-picked=1/9
prompt: _a detailed high-quality professional photo of swedish woman standing in front of a mirror, dark brown hair, white hat with purple feather_
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c0ec65a2ec8cb2f589233a/2PSWsmzLdHeVG-i67S7jF.png)
**Canny Edge for comparison (default in Automatic1111)**
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64c0ec65a2ec8cb2f589233a/JZTpa-HZfw0NUYnxZ52Iu.png)
_Notice all the missing edges, the noise and the artifacts. Yuck! Ugh!_
# Image dataset
* [laion2B-en aesthetics>=6.5 dataset](https://huggingface.co/datasets/ChristophSchuhmann/improved_aesthetics_6.5plus)
* `--min_image_size 512 --max_aspect_ratio 2 --resize_mode="center_crop" --image_size 512` (see the download sketch after this list)
* resulting in 180k images
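The flags above are [img2dataset](https://github.com/rom1504/img2dataset) options; a hedged sketch of the corresponding download call (the parquet file name and output folder are assumptions):
```
from img2dataset import download

# "metadata.parquet" stands in for the parquet file(s) from the dataset repo.
download(
    url_list="metadata.parquet",
    input_format="parquet",
    url_col="URL",
    caption_col="TEXT",
    output_folder="laion2b-en-aesthetics65",
    output_format="files",
    image_size=512,
    resize_mode="center_crop",
    min_image_size=512,
    max_aspect_ratio=2.0,
)
```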
# Training
```
accelerate launch train_controlnet.py ^
--pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" ^
--output_dir="control-edgedrawing-[version]-fp16/" ^
--dataset_name="mydataset" ^
--mixed_precision="fp16" ^
--resolution=512 ^
--learning_rate=1e-5 ^
--train_batch_size=1 ^
--gradient_accumulation_steps=4 ^
--gradient_checkpointing ^
--use_8bit_adam ^
--enable_xformers_memory_efficient_attention ^
--set_grads_to_none ^
--seed=0
```
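Note that `train_controlnet.py` loads `--dataset_name` via `datasets.load_dataset` and by default expects the columns `image`, `conditioning_image` and `text` (overridable with `--image_column`, `--conditioning_image_column` and `--caption_column`). A hedged sketch of assembling such a dataset from image/edge-map pairs; file paths and the repo id are placeholders:
```
from datasets import Dataset, Image

# Placeholder paths: source photos, their edpf edge maps and the captions.
ds = Dataset.from_dict({
    "image":              ["images/000000.jpg"],
    "conditioning_image": ["edges/000000.png"],
    "text":               ["a detailed high-quality professional photo ..."],
})
ds = ds.cast_column("image", Image()).cast_column("conditioning_image", Image())
ds.push_to_hub("user/mydataset")  # placeholder repo id; or keep it local and use --train_data_dir
```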
# Evaluation
To evaluate the model it makes sense to compare it with the original Canny model. Original evaluations and comparisons are available in the [ControlNet 1.0 repo](https://github.com/lllyasviel/ControlNet), the [ControlNet 1.1 repo](https://github.com/lllyasviel/ControlNet-v1-1-nightly), the [ControlNet paper v1](https://arxiv.org/abs/2302.05543v1), the [ControlNet paper v2](https://arxiv.org/abs/2302.05543) and the [Diffusers implementation](https://huggingface.co/takuma104/controlnet_dev/tree/main). Some points to keep in mind when comparing canny with edpf, so as not to compare apples with oranges:
* The canny 1.0 model was trained on 3M images and the canny 1.1 model on even more, while the edpf model has so far only been trained on 180k-360k images.
* The canny edge detector requires parameter tuning, while edpf is parameter-free.
* Do we manually fine-tune canny to find the perfect input image, or do we leave it at its defaults? One could argue that "no fine-tuning required" is the USP of edpf, so we should compare in the default setting, whereas canny fine-tuning is subjective.
* Would the canny model actually benefit from an edpf pre-processor, so that we might not even need a dedicated edpf model? (See the sketch after this list.)
* When evaluating human images we need to be aware of Stable Diffusion's inherent limits, like deformed faces and hands.
* When evaluating style we need to be aware of the bias from the image dataset (laion2b-en-aesthetics65), which might make the model tend to generate "aesthetic" images rather than work "intrinsically better".
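The pre-processor question above can be probed without any training: swap the published canny 1.0 ControlNet (`lllyasviel/sd-controlnet-canny`) into the inference sketch near the top and feed it edpf maps instead of canny maps.
```
import torch
from diffusers import ControlNetModel

# Drop-in replacement for the controlnet in the inference sketch above:
# canny 1.0 weights conditioned on edpf edge maps.
controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16)
```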
# Versions
**Experiment 1 - 2023-09-19 - control-edgedrawing-default-drop50-fp16-checkpoint-40000**
Images converted with https://github.com/shaojunluo/EDLinePython (based on the original, non-parameter-free Edge Drawing). Default settings are:
`smoothed=False`
```
{ 'ksize' : 5
, 'sigma' : 1.0
, 'gradientThreshold': 36
, 'anchorThreshold' : 8
, 'scanIntervals' : 1
}
```
Additional arguments: `--proportion_empty_prompts=0.5`.
Trained for 40000 steps with default settings => the proportion of empty prompts was probably too high.
Update 2023-09-22: a bug in the algorithm produces overly sparse images with default settings, see https://github.com/shaojunluo/EDLinePython/issues/4
**Experiment 2 - 2023-09-20 - control-edgedrawing-default-noisy-drop0-fp16-checkpoint-40000**
Same as experiment 1 with `smoothed=True` and `--proportion_empty_prompts=0`.
Trained for 40000 steps with default settings => the conditioning images were too noisy
**Experiment 3.0 - 2023-09-22 - control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-45000**
Conditioning images generated with [edpf.py](https://gitlab.com/-/snippets/3601881) using [opencv-contrib-python::ximgproc::EdgeDrawing](https://docs.opencv.org/4.8.0/d1/d1c/classcv_1_1ximgproc_1_1EdgeDrawing.html).
```
import cv2

# EDPF = Edge Drawing in parameter-free mode (opencv-contrib-python)
image = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE)  # detectEdges expects an 8-bit grayscale image

ed = cv2.ximgproc.createEdgeDrawing()
params = cv2.ximgproc.EdgeDrawing.Params()
params.PFmode = True  # parameter-free mode
ed.setParams(params)
ed.detectEdges(image)         # returns None; detected edges are stored internally
edge_map = ed.getEdgeImage()  # retrieve the binary edge map
```
45000 steps => This is **version 0.1 on civitai**.
**Experiment 3.1 - 2023-09-24 - control-edgedrawing-cv480edpf-drop0-fp16-checkpoint-90000**
90000 steps (45000 steps on the original images, 45000 steps with left-right flipped images) => quality improved; might release as 0.2 on civitai.
**Experiment 3.2 - 2023-09-24 - control-edgedrawing-cv480edpf-drop0+50-fp16-checkpoint-118000**
Resumed epoch 2 from step 90000 using `--proportion_empty_prompts=0.5` => results became worse; the ControlNet didn't pick up on empty prompts (I also tried the intermediate checkpoint-104000). Restarting from scratch with 50% prompt drop.
**Experiment 4.0 - 2023-09-25 - control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000**
See experiment 3.0. Restarted from step 0 with `--proportion_empty_prompts=0.5` =>
**Experiment 4.1 - control-edgedrawing-cv480edpf-drop50-fp16-checkpoint-45000**
# Ideas
* fine-tune off canny
* cleanup image dataset (l65)
* uncropped mod64 images
* integrate edcolor
* bigger image dataset (gcc)
* cleanup image dataset (gcc)
* re-train with fp32
# Questions and answers
**Q: What's the point of another edge control net anyway?**
A: 🤷