---
license: openrail++
tags:
- text-to-image
- stable-diffusion
- diffusers
---

# AnimeBoysXL v3.0

**It takes substantial time and effort to bake models. If you appreciate my models, I would be grateful if you could support me on [Ko-fi](https://ko-fi.com/koolchh) ☕.**

## Features

- ✔️ **Good for inference**: AnimeBoysXL v3.0 is a flexible model that is good at generating images of anime boys and male-only content in a wide range of styles.
- ✔️ **Good for training**: AnimeBoysXL v3.0 is suitable for further training, thanks to its neutral style and its ability to recognize a wide range of concepts. Feel free to train your own anime boy model/LoRA from AnimeBoysXL.
- ❌ AnimeBoysXL v3.0 is not optimized for generating anime girls. Please consider using other models for that purpose.

## Inference Guide

- **Prompt**: Use tag-based prompts to describe your subject.
  - Tag ordering matters. It is highly recommended to structure your prompt with one of the following templates:
    ```
    1boy, male focus, character name, series name, anything else you'd like to describe
    ```
    ```
    2boys, male focus, multiple boys, character name(s), series name, anything else you'd like to describe
    ```
  - Append
    ```
    , best quality, amazing quality, best aesthetic, amazing aesthetic, absurdres
    ```
    to the prompt to improve image quality.
  - (*Optional*) Append
    ```
    , year YYYY
    ```
    to the prompt to shift the output toward the prevalent style of that year. `YYYY` is a 4-digit year, e.g. `, year 2023`.
- **Negative prompt**: Choose one of the following two presets.
  1. Heavy (*recommended*):
     ```
     lowres, bad, text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts
     ```
  2. Light:
     ```
     lowres, jpeg artifacts, worst quality, watermark, blurry, bad aesthetic
     ```
- **VAE**: Make sure you are using the [SDXL VAE](https://huggingface.co/stabilityai/sdxl-vae/tree/main).
- **Sampling method, sampling steps and CFG scale**: I find **(Euler a, 28, 5)** works well. You are encouraged to experiment with other settings.
- **Width and height**: **832×1216** for portrait, **1024×1024** for square, and **1216×832** for landscape.

+ ## 🧨Diffusers Example Usage
53
+
54
+ ```python
55
+ import torch
56
+ from diffusers import DiffusionPipeline
57
+
58
+ pipe = DiffusionPipeline.from_pretrained("Koolchh/AnimeBoysXL-v3.0", torch_dtype=torch.float16, use_safetensors=True, variant="fp16")
59
+ pipe.to("cuda")
60
+
61
+ prompt = "1boy, male focus, shirt, solo, looking at viewer, smile, black hair, brown eyes, short hair, best quality, amazing quality, best aesthetic, amazing aesthetic, absurdres"
62
+ negative_prompt = "lowres, bad, text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts"
63
+
64
+ image = pipe(
65
+ prompt=prompt,
66
+ negative_prompt=negative_prompt,
67
+ width=1024,
68
+ height=1024,
69
+ guidance_scale=5,
70
+ num_inference_steps=28
71
+ ).images[0]
72
+ ```
73
+
## Training Details

AnimeBoysXL v3.0 is trained from [Pony Diffusion V6 XL](https://civitai.com/models/257749/pony-diffusion-v6-xl) on ~516k images.

The following tags are attached to the training data to make it easier to steer generation toward either more aesthetic or more flexible results.

### Quality tags

| tag               | score     |
|-------------------|-----------|
| `best quality`    | >= 150    |
| `amazing quality` | [75, 150) |
| `great quality`   | [25, 75)  |
| `normal quality`  | [0, 25)   |
| `bad quality`     | (-5, 0)   |
| `worst quality`   | <= -5     |

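The score brackets in the table above can be read as a simple cascade of threshold checks. A minimal sketch (`quality_tag` is a hypothetical helper; the release does not ship tagging code):

```python
def quality_tag(score: float) -> str:
    """Map a numeric quality score to its training tag,
    following the score brackets in the table above."""
    if score >= 150:
        return "best quality"
    if score >= 75:
        return "amazing quality"
    if score >= 25:
        return "great quality"
    if score >= 0:
        return "normal quality"
    if score > -5:
        return "bad quality"
    return "worst quality"
```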
### Aesthetic tags

| tag                 |
|---------------------|
| `best aesthetic`    |
| `amazing aesthetic` |
| `great aesthetic`   |
| `normal aesthetic`  |
| `bad aesthetic`     |

### Rating tags

| tag             | rating       |
|-----------------|--------------|
| `sfw`           | general      |
| `slightly nsfw` | sensitive    |
| `fairly nsfw`   | questionable |
| `very nsfw`     | explicit     |

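The rating-to-tag mapping above is a fixed lookup, which could be expressed as a dictionary (a hypothetical sketch; the release does not ship this code):

```python
# Maps a content rating to the tag attached during training.
RATING_TAGS: dict[str, str] = {
    "general": "sfw",
    "sensitive": "slightly nsfw",
    "questionable": "fairly nsfw",
    "explicit": "very nsfw",
}
```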
### Year tags

`year YYYY`, where `YYYY` is in the range [2005, 2023].

### Training configurations

- Hardware: 4 × NVIDIA A100 80GB GPUs
- Optimizer: AdaFactor
- Gradient accumulation steps: 8
- Effective batch size: 4 (GPUs) × 8 (gradient accumulation steps) × 4 (per-GPU batch size) = 128
- Learning rates:
  - 8e-6 for the U-Net
  - 5.2e-6 for text encoder 1 (CLIP ViT-L)
  - 4.8e-6 for text encoder 2 (OpenCLIP ViT-bigG)
- Learning rate schedule: constant with 250 warmup steps
- Mixed-precision training type: FP16
- Epochs: 40

### Changes from v2.0

- Change the base model from Stable Diffusion XL Base 1.0 to Pony Diffusion V6 XL.
- Revamp the dataset's aesthetic tags based on the developer's preference.
- Update the quality score and aesthetic score criteria.
- Use FP16 mixed-precision training.
- Train for more epochs.