gustproof committed on
Commit
3250b65
1 Parent(s): 177b4d2

Create posts/1to2.md

Files changed (1)
  1. posts/1to2.md +102 -0
posts/1to2.md ADDED
@@ -0,0 +1,102 @@
# 1to2: Training Multiple-Subject Models Using Only Single-Subject Data (Experimental)

Updates will be mirrored on both Hugging Face and Civitai.

## Introduction

[It has been shown that multiple characters can be trained into a single model](https://civitai.com/models/23476/the-idolmster-cinderella-girls-starlight-stage-style-90-characters). A harder task is to create a model that can generate multiple characters simultaneously without modifying the generation pipeline. This document describes a simple technique that has been shown to help with generating multiple characters in the same image.

## Method

```
Requirement: Sets of single-character images
Steps:
1. Train a multi-concept model using the original dataset
2. Create an augmentation dataset of joined image pairs from the original dataset
3. Train on the augmentation dataset
```
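
As an illustration of step 2, here is a minimal sketch that joins two single-character images side by side to form one augmentation sample. The `join_pair` helper, the file names, and the resizing policy are assumptions for illustration, not details from the original post.

```
# Minimal sketch of step 2: join two single-character images horizontally.
# Helper name, paths, and resizing policy are illustrative assumptions.
from PIL import Image

def join_pair(left_path: str, right_path: str, out_path: str) -> None:
    left = Image.open(left_path).convert("RGB")
    right = Image.open(right_path).convert("RGB")
    # Scale both images to a common height so the pair lines up.
    h = min(left.height, right.height)
    left = left.resize((round(left.width * h / left.height), h))
    right = right.resize((round(right.width * h / right.height), h))
    joined = Image.new("RGB", (left.width + right.width, h))
    joined.paste(left, (0, 0))
    joined.paste(right, (left.width, 0))
    joined.save(out_path)

join_pair("charA_001.png", "charB_007.png", "aug/charA_charB_0000.png")
```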

## Experiment

### Setup

Three characters from the game Cinderella Girls are chosen for the experiment. The base model is `Animefull-final-pruned`. It has been checked that the base model has minimal knowledge of the trained characters.

For the captions of the joined images, the template format `CharLeft/CharRight/COMPOSITE, TagsLeft, TagsRight` is used.
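
A caption for a joined image might then be assembled as in the sketch below; the helper name and the example character tokens and tags are illustrative assumptions, not data from the experiment.

```
# Hypothetical sketch: build a caption following the template
# `CharLeft/CharRight/COMPOSITE, TagsLeft, TagsRight`.
def composite_caption(char_left, char_right, tags_left, tags_right):
    head = f"{char_left}/{char_right}/COMPOSITE"
    return ", ".join([head] + tags_left + tags_right)

print(composite_caption(
    "charA", "charB",                      # illustrative character tokens
    ["1girl", "brown hair", "smile"],      # TagsLeft
    ["1girl", "black hair", "standing"],   # TagsRight
))
# -> charA/charB/COMPOSITE, 1girl, brown hair, smile, 1girl, black hair, standing
```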

A LoRA (Hadamard product) is trained using the config file below:
```
[model_arguments]
v2 = false
v_parameterization = false
pretrained_model_name_or_path = "Animefull-final-pruned.ckpt"

[additional_network_arguments]
no_metadata = false
unet_lr = 0.0005
text_encoder_lr = 0.0005
network_module = "lycoris.kohya"
network_dim = 8
network_alpha = 1
network_args = [ "conv_dim=0", "conv_alpha=16", "algo=loha",]
network_train_unet_only = false
network_train_text_encoder_only = false

[optimizer_arguments]
optimizer_type = "AdamW8bit"
learning_rate = 0.0005
max_grad_norm = 1.0
lr_scheduler = "cosine"
lr_warmup_steps = 0

[dataset_arguments]
debug_dataset = false
# keep token 1

[training_arguments]
output_name = "cg3comp"
save_precision = "fp16"
save_every_n_epochs = 1
train_batch_size = 2
max_token_length = 225
mem_eff_attn = false
xformers = true
max_train_epochs = 40
max_data_loader_n_workers = 8
persistent_data_loader_workers = true
gradient_checkpointing = false
gradient_accumulation_steps = 1
mixed_precision = "fp16"
clip_skip = 2
lowram = true

[sample_prompt_arguments]
sample_every_n_epochs = 1
sample_sampler = "k_euler_a"

[saving_arguments]
save_model_as = "safetensors"
```
For the second stage of training, the batch size was reduced to 2 while keeping other settings identical.
The training took less than 2 hours on a T4 GPU.
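
For reference, the two sequential runs could be scripted as below. The config and dataset file names, and the use of `accelerate launch` with kohya-ss sd-scripts' `train_network.py`, are assumptions about a typical setup rather than commands from this post.

```
# Hedged sketch of the sequential two-stage schedule (stage 1 on the
# original dataset, stage 2 on the joined-pair augmentation dataset).
# All file names here are illustrative.
import subprocess

for dataset_cfg in ("dataset_original.toml", "dataset_augmented.toml"):
    subprocess.run(
        ["accelerate", "launch", "train_network.py",
         "--config_file", "cg3comp.toml",
         "--dataset_config", dataset_cfg],
        check=True,
    )
```

In practice the second run would also need to continue from the stage-one weights (e.g. via sd-scripts' `network_weights` option) rather than starting fresh.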

### Results

(see preview images)

## Limitations

* This technique doubles the memory/compute requirement
* Composites can still be generated despite negative prompting
* Cloned characters seem to become the primary failure mode in place of blended characters

## Related Works

Models trained on datasets derived from anime shows have [demonstrated](https://civitai.com/models/21305/) multi-subject capability.
Simply using sufficiently distant concepts such as `1girl, 1boy` [has also been shown to be effective](https://civitai.com/models/17640/).

## Future work

Below is a list of ideas yet to be explored:
* Synthetic datasets
* Regularization
* Joint training instead of sequential training