alea31415 commited on
Commit
1bbde23
·
1 Parent(s): 066842f

Update README.md

Files changed (1)
  1. README.md +73 -0
README.md CHANGED
@@ -1,3 +1,76 @@
1
  ---
2
  license: creativeml-openrail-m
3
  ---
4
+
5
+ ---
6
+ license: creativeml-openrail-m
7
+ ---
8
+
9
+ This is a low-quality bocchi-the-rock (ぼっち・ざ・ろっく!) character model.
10
+ Similar to my [yama-no-susume model](https://huggingface.co/alea31415/yama-no-susume), this model can generate **multi-character scenes** in addition to images of a single character.
11
+ Of course, the result is still hit-or-miss, but I think the success rate of getting the **entire Kessoku Band** right in one shot is already quite high,
12
+ and otherwise, you can always rely on inpainting.
13
+ Here are two examples:
14
+
15
+ With inpainting
16
+ *Coming soon*
17
+
18
+ Without inpainting
19
+ *Coming soon*
20
+
21
+
22
+ ### Characters
23
+
24
+ The model knows 12 characters from Bocchi the Rock.
25
+ The resemblance to a character can be improved by describing their appearance in more detail.
26
+
27
+ *Coming soon*
28
+
29
+ ### Dataset description
30
+
31
+ The dataset contains around 27K images with the following composition:
32
+ - 7024 anime screenshots
33
+ - 1630 fan arts
34
+ - 18519 customized regularization images
35
+
36
+ The model is trained with a specific weighting scheme to balance the different concepts.
37
+ For example, the above three categories have weights of 0.3, 0.25, and 0.45, respectively.
38
+ Each category is itself split into many sub-categories in a hierarchical way.
39
+ For more details on the data preparation process please refer to https://github.com/cyber-meow/anime_screenshot_pipeline
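
To make the weighting above concrete, here is a minimal sketch of how category weights could be turned into per-image repeat factors. It is an illustration only: it assumes a flat, single-level scheme (the actual one is hierarchical), the function name `repeat_factors` is hypothetical, and `epoch_size` reuses the approximate 300K figure quoted in the hyperparameter section of this README.

```python
# Image counts per top-level category, taken from the list above.
counts = {"screenshots": 7024, "fanart": 1630, "regularization": 18519}
# Target sampling weights for the same categories, taken from the text above.
weights = {"screenshots": 0.30, "fanart": 0.25, "regularization": 0.45}


def repeat_factors(counts, weights, epoch_size):
    """Hypothetical helper: per-image repeat factor for each category so that
    the category fills roughly weights[c] * epoch_size samples of one epoch."""
    return {c: weights[c] * epoch_size / counts[c] for c in counts}


# Assumption: an epoch of about 300K weighted samples.
factors = repeat_factors(counts, weights, epoch_size=300_000)
for name, factor in factors.items():
    print(f"{name}: each image repeated about {factor:.1f} times")
```

Under these assumptions the small fan art category is repeated far more often per image than the screenshots or the regularization images, which is the point of weighting by category rather than by raw image count.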
40
+
41
+
42
+ ### Training Details
43
+
44
+ #### Trainer
45
+ The model is trained using [EveryDream1](https://github.com/victorchall/EveryDream-trainer) as
46
+ EveryDream seems to be the only trainer out there that supports sample weighting (through the use of `multiply.txt`).
47
+ Note that for future training it makes sense to migrate to [EveryDream2](https://github.com/victorchall/EveryDream2trainer).
48
+
49
+ #### Hardware and cost
50
+ The model was trained on RunPod using a 3090 and cost me around 15 dollars.
51
+
52
+ #### Hyperparameter specification
53
+
54
+ - The model is trained for 48000 steps, at batch size 4, learning rate 1e-6, resolution 512, and a conditional dropout rate of 10%.
55
+
56
+ Note that because the weighting scheme translates into a different multiplier for each image,
57
+ the repeat and epoch counts have a rather different meaning here.
58
+ For example, depending on the weighting, I have around 300K images (some images are used multiple times) in an epoch,
59
+ and therefore I did not even finish an entire epoch with the 48000 steps at batch size 4.
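
The arithmetic behind this claim can be checked with a short sketch. The step count and batch size are the values given above; the epoch size is the approximate 300K figure, so the resulting fraction is only approximate as well.

```python
steps = 48_000
batch_size = 4
epoch_size = 300_000  # approximate number of weighted samples in one epoch

# Total number of samples the trainer has seen after all steps.
samples_seen = steps * batch_size
# Fraction of one weighted epoch covered by those samples.
epoch_fraction = samples_seen / epoch_size

print(f"samples seen: {samples_seen}")
print(f"fraction of an epoch: {epoch_fraction:.2f}")
```

With these numbers the run sees 192000 samples, or about 0.64 of one weighted epoch.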
60
+
61
+ ### Failures
62
+
63
+ - For the first 24000 steps I used the trigger words `Bfan1` and `Bfan2` for the two fans of Bocchi.
64
+ However, these two words are too similar and the model failed to learn different characters for them. Therefore I changed `Bfan2` to `Bofa2` at step 24000.
65
+
66
+
67
+ ### More Example Generations
68
+
69
+ With inpainting
70
+ *Coming soon*
71
+
72
+ Without inpainting
73
+ *Coming soon*
74
+
75
+ Some failure cases
76
+ *Coming soon*