---
license: openrail++
tags:
- text-to-image
- stable-diffusion
- diffusers
---

# AnimeBoysXL v3.0

**It takes substantial time and effort to bake models. If you appreciate my models, I would be grateful if you could support me on [Ko-fi](https://ko-fi.com/koolchh) ☕.**

## Features

- ✔️ **Good for inference**: AnimeBoysXL v3.0 is a flexible model that is good at generating images of anime boys and male-only content in a wide range of styles.
- ✔️ **Good for training**: AnimeBoysXL v3.0 is suitable for further training, thanks to its neutral style and ability to recognize a wide range of concepts. Feel free to train your own anime boy model/LoRA from AnimeBoysXL.
- ❌ AnimeBoysXL v3.0 is not optimized for creating anime girls. Please consider using other models for that purpose.

## Inference Guide

- **Prompt**: Use tag-based prompts to describe your subject.
  - Tag ordering matters. It is highly recommended to structure your prompt with one of the following templates:
    ```
    1boy, male focus, character name, series name, anything else you'd like to describe, best aesthetic, amazing aesthetic, great aesthetic, normal aesthetic, best quality, amazing quality, great quality, normal quality, absurdres
    ```
    ```
    2boys, male focus, multiple boys, character name(s), series name, anything else you'd like to describe, best aesthetic, amazing aesthetic, great aesthetic, normal aesthetic, best quality, amazing quality, great quality, normal quality, absurdres
    ```
- **Negative prompt**: Choose one of the following two presets.
  1. Light (*recommended*):
    ```
    lowres, jpeg artifacts, worst quality, watermark, blurry, bad aesthetic
    ```
  2. Heavy:
    ```
    lowres, bad, text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts
    ```
- **VAE**: Make sure you're using the [SDXL VAE](https://huggingface.co/stabilityai/sdxl-vae/tree/main).
- **Sampling method, sampling steps and CFG scale**: I find **Euler a**, **28** steps, and a CFG scale of **5** to work well. You are encouraged to experiment with other settings.
- **Width and height**: **832×1216** for portrait, **1024×1024** for square, and **1216×832** for landscape.
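
The templates and settings above can be collected into a small helper. This is only a sketch, not part of the model: `build_prompt`, `AESTHETIC_QUALITY_TAGS`, `NEGATIVE_PRESETS`, and `RESOLUTIONS` are hypothetical names, and the tag strings are copied from the guide above. Because the guide only gives templates for one or two boys, the helper rejects other counts.

```python
# Hypothetical helper names; every tag string is copied from the inference guide.
AESTHETIC_QUALITY_TAGS = (
    "best aesthetic, amazing aesthetic, great aesthetic, normal aesthetic, "
    "best quality, amazing quality, great quality, normal quality, absurdres"
)

NEGATIVE_PRESETS = {
    "light": "lowres, jpeg artifacts, worst quality, watermark, blurry, bad aesthetic",
    "heavy": (
        "lowres, bad, text, error, missing, extra, fewer, cropped, jpeg artifacts, "
        "worst quality, bad quality, watermark, bad aesthetic, unfinished, "
        "chromatic aberration, scan, scan artifacts"
    ),
}

# (width, height) pairs from the guide.
RESOLUTIONS = {
    "portrait": (832, 1216),
    "square": (1024, 1024),
    "landscape": (1216, 832),
}


def build_prompt(num_boys, characters=(), series=None, extra_tags=()):
    """Assemble a prompt in the tag order recommended by the guide."""
    if num_boys == 1:
        tags = ["1boy", "male focus"]
    elif num_boys == 2:
        tags = ["2boys", "male focus", "multiple boys"]
    else:
        raise ValueError("the guide only gives templates for one or two boys")
    tags.extend(characters)          # character name(s)
    if series:
        tags.append(series)          # series name
    tags.extend(extra_tags)          # anything else you'd like to describe
    tags.append(AESTHETIC_QUALITY_TAGS)
    return ", ".join(tags)
```

For example, `build_prompt(1, extra_tags=["solo", "smile"])` yields a prompt that starts with `1boy, male focus, solo, smile` and ends with the aesthetic and quality tags.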

## 🧨Diffusers Example Usage

```python
import torch
from diffusers import DiffusionPipeline

# Load the fp16 weights and move the pipeline to the GPU.
pipe = DiffusionPipeline.from_pretrained(
    "Koolchh/AnimeBoysXL-v3.0",
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
)
pipe.to("cuda")

# Tag-based prompt and the light negative preset from the inference guide.
prompt = "1boy, male focus, shirt, solo, looking at viewer, smile, black hair, brown eyes, short hair, best quality, amazing quality, best aesthetic, great quality, amazing aesthetic, great aesthetic, normal aesthetic, normal quality, absurdres"
negative_prompt = "lowres, jpeg artifacts, worst quality, watermark, blurry, bad aesthetic"

# Square 1024x1024 image, CFG scale 5, 28 sampling steps.
image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=1024,
    height=1024,
    guidance_scale=5,
    num_inference_steps=28,
).images[0]
```
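
The inference guide recommends the Euler a sampler, but the example above does not set a scheduler explicitly. In case the repository's default scheduler differs, here is a minimal sketch of switching to it. `use_euler_a` is a hypothetical helper name; `EulerAncestralDiscreteScheduler` is the Diffusers class that corresponds to Euler a.

```python
def use_euler_a(pipe):
    """Swap the pipeline's scheduler for Euler Ancestral, reusing its current config."""
    # Imported inside the function so the helper can be defined before Diffusers is loaded.
    from diffusers import EulerAncestralDiscreteScheduler

    pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
    return pipe
```

Call `use_euler_a(pipe)` after loading the pipeline and before generating an image.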

## Training Details

AnimeBoysXL v3.0 is trained from [Pony Diffusion V6 XL](https://civitai.com/models/257749/pony-diffusion-v6-xl), on ~516k images.

The following tags are attached to the training data to make it easier to steer toward either more aesthetic or more flexible results.

### Quality tags

| tag               | score     |
|-------------------|-----------|
| `best quality`    | >= 150    |
| `amazing quality` | [75, 150) |
| `great quality`   | [25, 75)  |
| `normal quality`  | [0, 25)   |
| `bad quality`     | (-5, 0)   |
| `worst quality`   | <= -5     |
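
The score intervals in the table can be read as a simple threshold lookup. The sketch below assumes the score is a plain number; the card does not say where the score comes from, and `quality_tag` is a hypothetical name.

```python
def quality_tag(score):
    """Map a numeric score to the quality tag whose interval contains it."""
    if score >= 150:
        return "best quality"      # >= 150
    if score >= 75:
        return "amazing quality"   # [75, 150)
    if score >= 25:
        return "great quality"     # [25, 75)
    if score >= 0:
        return "normal quality"    # [0, 25)
    if score > -5:
        return "bad quality"       # (-5, 0)
    return "worst quality"         # <= -5
```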

### Aesthetic tags

| tag                 |
|---------------------|
| `best aesthetic`    |
| `amazing aesthetic` |
| `great aesthetic`   |
| `normal aesthetic`  |
| `bad aesthetic`     |

### Rating tags

| tag             | rating       |
|-----------------|--------------|
| `sfw`           | general      |
| `slightly nsfw` | sensitive    |
| `fairly nsfw`   | questionable |
| `very nsfw`     | explicit     |

### Year tags

`year YYYY` where `YYYY` is in the range of [2005, 2023].
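
As a small illustration of the rating and year tags, the sketch below maps a rating to its tag and formats a year tag, rejecting years outside the range seen in training. `RATING_TAGS` and `year_tag` are hypothetical names; the card does not specify where these tags should sit in a prompt, so the helper only produces the tag strings.

```python
# Rating -> tag, as listed in the rating table.
RATING_TAGS = {
    "general": "sfw",
    "sensitive": "slightly nsfw",
    "questionable": "fairly nsfw",
    "explicit": "very nsfw",
}


def year_tag(year):
    """Return the `year YYYY` tag for a year in the inclusive range [2005, 2023]."""
    if not 2005 <= year <= 2023:
        raise ValueError("year tags only cover 2005 through 2023")
    return f"year {year}"
```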

### Training configurations

- Hardware: 4 × NVIDIA A100 80GB GPUs
- Optimizer: AdaFactor
- Gradient accumulation steps: 8
- Batch size: 4 * 8 * 4 = 128
- Learning rates:
  - 8e-6 for U-Net
  - 5.2e-6 for text encoder 1 (CLIP ViT-L)
  - 4.8e-6 for text encoder 2 (OpenCLIP ViT-bigG)
- Learning rate schedule: constant with 250 warmup steps
- Mixed precision training type: FP16
- Epochs: 40
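
The arithmetic behind these settings can be sketched as follows. Several points are assumptions rather than facts from the card: that the third factor in the batch-size product is the per-GPU batch size, that "~516k" is taken as exactly 516,000 images, that a partial final batch still counts as a step, and that the warmup ramps linearly from zero. The derived step counts are therefore rough estimates.

```python
import math

# Figures taken from the list above.
NUM_GPUS = 4
GRAD_ACCUM_STEPS = 8
PER_GPU_BATCH = 4        # assumption: the remaining factor of 4 is the per-GPU batch size
WARMUP_STEPS = 250
EPOCHS = 40
DATASET_SIZE = 516_000   # "~516k" in the card, so step counts below are approximate

# Images consumed per optimizer step.
effective_batch = NUM_GPUS * GRAD_ACCUM_STEPS * PER_GPU_BATCH

# Assumption: one optimizer step per effective batch, partial last batch kept.
steps_per_epoch = math.ceil(DATASET_SIZE / effective_batch)
total_steps = steps_per_epoch * EPOCHS


def lr_at(step, base_lr, warmup_steps=WARMUP_STEPS):
    """Constant learning rate after a warmup (assumed linear from 0)."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr
```

Under these assumptions, `lr_at(step, 8e-6)` gives the U-Net learning rate at a given optimizer step, and the 250 warmup steps are a very small fraction of `total_steps`.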

### Changes from v2.0

- Change the base model from Stable Diffusion XL Base 1.0 to Pony Diffusion V6 XL.
- Revamp the dataset's aesthetic tag based on the developer's preference.
- Update quality score and aesthetic score criteria.
- Use FP16 mixed-precision training.
- Train for more epochs.

## License

Since AnimeBoysXL v3.0 is a derivative model of [Pony Diffusion V6 XL](https://civitai.com/models/257749/pony-diffusion-v6-xl) by PurpleSmartAI, it has a different license from the previous versions. Please read their license before using the model.