---
license: openrail++
tags:
- text-to-image
- stable-diffusion
- diffusers
widget:
- text: 1boy, male focus, japanese clothes, yukata, muscular male, muscular, paw pose, solo, looking at viewer, grin, black hair, blue eyes, short hair, flower, petals, best quality, amazing quality, best aesthetic, absurdres, year 2023
  parameters:
    negative_prompt: lowres, (bad:1.05), text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts, 1girl, breasts, realistic, lips, nose
  output:
    url: images/sample01.png
  example_title: sample01
- text: 1boy, male focus, sitting, on couch, couch, crossed legs, hand on own face, white shirt, shirt, black pants, pants, necktie, indoors, solo, looking at viewer, open mouth, white hair, yellow eyes,  short hair, best quality, amazing quality, best aesthetic, absurdres, year 2023
  parameters:
    negative_prompt: lowres, (bad:1.05), text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts, 1girl, breasts, realistic, lips, nose
  output:
    url: images/sample02.png
  example_title: sample02
- text: 2boys, male focus, multiple boys, yaoi, imminent kiss, looking at another, smile, short hair, black hair, closed eyes, brown hair, blue eyes, shirt, lens flare, sky, cloud, blue sky, sweat, best quality, amazing quality, best aesthetic, absurdres, year 2023
  parameters:
    negative_prompt: lowres, (bad:1.05), text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts, 1girl, breasts, realistic, lips, nose
  output:
    url: images/sample03.png
  example_title: sample03
- text: 1boy, male focus, tank top, black tank top, sidepec, muscular male, muscular, bara, upper body, solo, looking to the side, annoyed, speech bubble, short hair, undercut, stubble, black hair, green eyes, parted lips, white background, simple background, best quality, amazing quality, best aesthetic, absurdres, year 2023
  parameters:
    negative_prompt: lowres, (bad:1.05), text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts, 1girl, breasts
  output:
    url: images/sample04.png
  example_title: sample04
---

# AnimeBoysXL v2.0

**Baking models takes substantial time and effort. If you appreciate my models, I would be grateful if you could support me on [Ko-fi](https://ko-fi.com/koolchh) ☕.**

<Gallery />

## Features

- ✔️ **Good for inference**: AnimeBoysXL v2.0 is a flexible model that is good at generating images of anime boys and male-only content in a wide range of styles.
- ✔️ **Good for training**: AnimeBoysXL v2.0 is suitable for further training, thanks to its neutral style and its ability to recognize a wide range of concepts. Feel free to train your own anime boy model/LoRA from AnimeBoysXL.
- ❌ AnimeBoysXL v2.0 is not optimized for creating anime girls. Please consider using other models for that purpose.

## Inference Guide

- **Prompt**: Use tag-based prompts to describe your subject.
  - Tag ordering matters. It is highly recommended to structure your prompt with one of the following templates:
    ```
    1boy, male focus, character name, series name, anything else you'd like to describe
    ```
    ```
    2boys, male focus, multiple boys, character name(s), series name, anything else you'd like to describe
    ```
  - Append
    ```
    , best quality, amazing quality, best aesthetic, absurdres
    ```
    to the prompt to improve image quality.
  - (*Optional*) Append
    ```
    , year YYYY
    ```
    to the prompt to shift the output toward the prevalent style of that year. `YYYY` is a four-digit year, e.g. `, year 2023`.
  - For more detailed documentation, you can visit my [article](https://ko-fi.com/post/Advanced-Prompt-Guide-for-AnimeBoysXL-V3-Z8Z2WWYHS) on Ko-fi (available to supporters only).
- **Negative prompt**: Choose one of the following two presets.
  1. Heavy (*recommended*):
    ```
    lowres, (bad:1.05), text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts, 1girl, breasts
    ```
  2. Light:
    ```
    lowres, jpeg artifacts, worst quality, watermark, blurry, bad aesthetic, 1girl, breasts
    ```
  - (*Optional*) Add
    ```
    , realistic, lips, nose
    ```
    to the negative prompt if you need a flat, anime-style face.
- **VAE**: Make sure you're using [SDXL VAE](https://huggingface.co/stabilityai/sdxl-vae/tree/main).
- **Sampling method, sampling steps and CFG scale**: I find **Euler a**, **28** steps, and a CFG scale of **5** to work well. You are encouraged to experiment with other settings.
- **Width and height**: **832×1216** for portrait, **1024×1024** for square, and **1216×832** for landscape.
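
The prompt rules above can be sketched as a few small helpers. This is only an illustration, not part of the model or of any library: the names `build_prompt`, `build_negative_prompt`, and `RESOLUTIONS` are hypothetical, and the year check assumes the `[2005, 2023]` range listed under the year tags.

```python
# Hypothetical helpers that assemble prompts following the guide above.
QUALITY_SUFFIX = "best quality, amazing quality, best aesthetic, absurdres"

NEGATIVE_HEAVY = (
    "lowres, (bad:1.05), text, error, missing, extra, fewer, cropped, "
    "jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, "
    "unfinished, chromatic aberration, scan, scan artifacts, 1girl, breasts"
)
NEGATIVE_LIGHT = (
    "lowres, jpeg artifacts, worst quality, watermark, blurry, "
    "bad aesthetic, 1girl, breasts"
)
FLAT_FACE_EXTRA = "realistic, lips, nose"

# (width, height) pairs recommended in the guide.
RESOLUTIONS = {
    "portrait": (832, 1216),
    "square": (1024, 1024),
    "landscape": (1216, 832),
}


def build_prompt(subject_tags, description_tags, year=None):
    """Join tags in the recommended order, then append quality and year tags."""
    parts = list(subject_tags) + list(description_tags) + [QUALITY_SUFFIX]
    if year is not None:
        if not 2005 <= year <= 2023:
            raise ValueError("year tags cover 2005 to 2023 only")
        parts.append(f"year {year}")
    return ", ".join(parts)


def build_negative_prompt(heavy=True, flat_face=False):
    """Pick the heavy or light preset, optionally adding the flat-face tags."""
    base = NEGATIVE_HEAVY if heavy else NEGATIVE_LIGHT
    return f"{base}, {FLAT_FACE_EXTRA}" if flat_face else base


prompt = build_prompt(["1boy", "male focus"], ["solo", "smile"], year=2023)
negative_prompt = build_negative_prompt(heavy=True, flat_face=True)
width, height = RESOLUTIONS["portrait"]
```

Keeping the subject tags (`1boy, male focus`, character and series names) in a separate argument makes it harder to break the tag ordering by accident.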

## 🧨Diffusers Example Usage

```python
import torch
from diffusers import DiffusionPipeline

# Load the fp16 safetensors weights and move the pipeline to the GPU.
pipe = DiffusionPipeline.from_pretrained(
    "Koolchh/AnimeBoysXL-v2.0",
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
)
pipe.to("cuda")

# Tag-based prompt ending with the recommended quality tags.
prompt = "1boy, male focus, shirt, solo, looking at viewer, smile, black hair, brown eyes, short hair, best quality, amazing quality, best aesthetic, absurdres"
# "Heavy" negative prompt preset from the inference guide.
negative_prompt = "lowres, (bad:1.05), text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts, 1girl, breasts"

# Square output with the recommended CFG scale and step count.
image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=1024,
    height=1024,
    guidance_scale=5,
    num_inference_steps=28,
).images[0]
```

## Training Details

AnimeBoysXL v2.0 was trained from [Stable Diffusion XL Base 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) on ~516k images.

The following tags are attached to the training data to make it easier to steer toward either more aesthetic or more flexible results.

### Quality tags

| tag               | score      |
|-------------------|------------|
| `best quality`    | >= 150     |
| `amazing quality` | [100, 150) |
| `great quality`   | [75, 100)  |
| `normal quality`  | [0, 75)    |
| `bad quality`     | (-5, 0)    |
| `worst quality`   | <= -5      |
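
As a rough illustration of the table above, the thresholds can be read as a simple lookup. The function name `quality_tag` is hypothetical, and how the score itself is computed is not specified here; this sketch only mirrors the listed intervals.

```python
def quality_tag(score: float) -> str:
    """Map a score to a quality tag using the intervals in the table above."""
    if score >= 150:
        return "best quality"
    if score >= 100:
        return "amazing quality"
    if score >= 75:
        return "great quality"
    if score >= 0:
        return "normal quality"
    if score > -5:
        return "bad quality"
    return "worst quality"
```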

### Aesthetic tags

| tag                | score        |
|--------------------|--------------|
| `best aesthetic`   | >= 6.675     |
| `great aesthetic`  | [6.0, 6.675) |
| `normal aesthetic` | [5.0, 6.0)   |
| `bad aesthetic`    | < 5.0        |
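
The aesthetic thresholds can be sketched the same way. `aesthetic_tag` is a hypothetical name, and the scoring model that produces the aesthetic score is an unknown here; only the intervals come from the table.

```python
def aesthetic_tag(score: float) -> str:
    """Map an aesthetic score to its tag using the intervals in the table above."""
    if score >= 6.675:
        return "best aesthetic"
    if score >= 6.0:
        return "great aesthetic"
    if score >= 5.0:
        return "normal aesthetic"
    return "bad aesthetic"
```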

### Rating tags

| tag             | rating       |
|-----------------|--------------|
| `sfw`           | general      |
| `slightly nsfw` | sensitive    |
| `fairly nsfw`   | questionable |
| `very nsfw`     | explicit     |
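
If you tag your own dataset for further training, the rating table above amounts to a small mapping. The names `RATING_TO_TAG` and `rating_tag` are hypothetical and only illustrate the correspondence.

```python
# Rating label -> tag, as listed in the table above.
RATING_TO_TAG = {
    "general": "sfw",
    "sensitive": "slightly nsfw",
    "questionable": "fairly nsfw",
    "explicit": "very nsfw",
}


def rating_tag(rating: str) -> str:
    """Return the tag for a rating label; raises KeyError for unknown labels."""
    return RATING_TO_TAG[rating.lower()]
```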

### Year tags

`year YYYY` where `YYYY` is in the range of [2005, 2023].

### Training configurations

- Hardware: 4 × NVIDIA A100 80GB GPUs
- Optimizer: AdaFactor
- Gradient accumulation steps: 8
- Batch size: 4 * 8 * 4 = 128
- Learning rates:
  - 8e-6 for U-Net
  - 5.2e-6 for text encoder 1 (CLIP ViT-L)
  - 4.8e-6 for text encoder 2 (OpenCLIP ViT-bigG)
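
The batch size arithmetic above can be spelled out as follows. Which factor is which is an assumption: 4 GPUs and 8 gradient accumulation steps come from the list, and the remaining factor of 4 is taken to be the per-GPU batch size.

```python
num_gpus = 4          # from "Hardware"
grad_accum_steps = 8  # from "Gradient accumulation steps"
per_gpu_batch = 4     # assumption: the remaining factor in 4 * 8 * 4

# Number of samples contributing to each optimizer step.
effective_batch_size = num_gpus * grad_accum_steps * per_gpu_batch
print(effective_batch_size)  # 128
```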

### Changes from v1.0

- Train with tag ordering.
- Add `sfw` rating tag.
- More epochs on the questionable and explicit rating subset.
- FP16 mixed-precision training for final epochs.