---
license: openrail++
tags:
- text-to-image
- stable-diffusion
- diffusers
---

# AnimeBoysXL v3.0

**It takes substantial time and effort to bake models. If you appreciate my models, I would be grateful if you could support me on [Ko-fi](https://ko-fi.com/koolchh) ☕.**

## Features

- ✔️ **Good for inference**: AnimeBoysXL v3.0 is a flexible model that is good at generating images of anime boys and male-only content in a wide range of styles.
- ✔️ **Good for training**: AnimeBoysXL v3.0 is suitable for further training, thanks to its neutral style and its ability to recognize a wide range of concepts. Feel free to train your own anime boy model/LoRA from AnimeBoysXL.
- ❌ AnimeBoysXL v3.0 is not optimized for generating anime girls. Please consider using other models for that purpose.

## Inference Guide

- **Prompt**: Use tag-based prompts to describe your subject.
  - Tag ordering matters. It is highly recommended to structure your prompt with one of the following templates:
    ```
    1boy, male focus, character name, series name, anything else you'd like to describe
    ```
    ```
    2boys, male focus, multiple boys, character name(s), series name, anything else you'd like to describe
    ```
  - Append
    ```
    , best quality, amazing quality, best aesthetic, amazing aesthetic, absurdres
    ```
    to the prompt to improve image quality.
  - (*Optional*) Append
    ```
    , year YYYY
    ```
    to the prompt to shift the output toward the prevalent style of that year. `YYYY` is a 4-digit year, e.g. `, year 2023`.
- **Negative prompt**: Choose one of the following two presets.
  1. Heavy (*recommended*):
     ```
     lowres, bad, text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts
     ```
  2. Light:
     ```
     lowres, jpeg artifacts, worst quality, watermark, blurry, bad aesthetic
     ```
- **VAE**: Make sure you are using the [SDXL VAE](https://huggingface.co/stabilityai/sdxl-vae/tree/main).
- **Sampling method, sampling steps and CFG scale**: I find **(Euler a, 28, 5)** works well. You are encouraged to experiment with other settings.
- **Width and height**: **832×1216** for portrait, **1024×1024** for square, and **1216×832** for landscape.

+ ## 🧨Diffusers Example Usage
53
+
54
+ ```python
55
+ import torch
56
+ from diffusers import DiffusionPipeline
57
+
58
+ pipe = DiffusionPipeline.from_pretrained("Koolchh/AnimeBoysXL-v3.0", torch_dtype=torch.float16, use_safetensors=True, variant="fp16")
59
+ pipe.to("cuda")
60
+
61
+ prompt = "1boy, male focus, shirt, solo, looking at viewer, smile, black hair, brown eyes, short hair, best quality, amazing quality, best aesthetic, amazing aesthetic, absurdres"
62
+ negative_prompt = "lowres, bad, text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts"
63
+
64
+ image = pipe(
65
+ prompt=prompt,
66
+ negative_prompt=negative_prompt,
67
+ width=1024,
68
+ height=1024,
69
+ guidance_scale=5,
70
+ num_inference_steps=28
71
+ ).images[0]
72
+ ```
73
+
## Training Details

AnimeBoysXL v3.0 is trained from [Pony Diffusion V6 XL](https://civitai.com/models/257749/pony-diffusion-v6-xl) on ~516k images.

The following tags are attached to the training data to make it easier to steer generation toward either more aesthetic or more flexible results.

### Quality tags

| tag               | score     |
|-------------------|-----------|
| `best quality`    | >= 150    |
| `amazing quality` | [75, 150) |
| `great quality`   | [25, 75)  |
| `normal quality`  | [0, 25)   |
| `bad quality`     | (-5, 0)   |
| `worst quality`   | <= -5     |

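The score brackets in the table above can be read as a simple cascade of threshold checks. A minimal sketch (`quality_tag` is a hypothetical helper; the release does not ship tagging code):

```python
def quality_tag(score: float) -> str:
    """Map a numeric quality score to its training tag,
    following the score brackets in the table above."""
    if score >= 150:
        return "best quality"
    if score >= 75:
        return "amazing quality"
    if score >= 25:
        return "great quality"
    if score >= 0:
        return "normal quality"
    if score > -5:
        return "bad quality"
    return "worst quality"
```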
### Aesthetic tags

| tag                 |
|---------------------|
| `best aesthetic`    |
| `amazing aesthetic` |
| `great aesthetic`   |
| `normal aesthetic`  |
| `bad aesthetic`     |

### Rating tags

| tag             | rating       |
|-----------------|--------------|
| `sfw`           | general      |
| `slightly nsfw` | sensitive    |
| `fairly nsfw`   | questionable |
| `very nsfw`     | explicit     |

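The rating-to-tag mapping above is a fixed lookup, which could be expressed as a dictionary (a hypothetical sketch; the release does not ship this code):

```python
# Maps a content rating to the tag attached during training.
RATING_TAGS: dict[str, str] = {
    "general": "sfw",
    "sensitive": "slightly nsfw",
    "questionable": "fairly nsfw",
    "explicit": "very nsfw",
}
```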
### Year tags

`year YYYY`, where `YYYY` is in the range [2005, 2023].

### Training configurations

- Hardware: 4 × NVIDIA A100 80GB GPUs
- Optimizer: AdaFactor
- Gradient accumulation steps: 8
- Effective batch size: 4 (GPUs) × 8 (gradient accumulation steps) × 4 (per-GPU batch size) = 128
- Learning rates:
  - 8e-6 for the U-Net
  - 5.2e-6 for text encoder 1 (CLIP ViT-L)
  - 4.8e-6 for text encoder 2 (OpenCLIP ViT-bigG)
- Learning rate schedule: constant with 250 warmup steps
- Mixed-precision training type: FP16
- Epochs: 40

### Changes from v2.0

- Change the base model from Stable Diffusion XL Base 1.0 to Pony Diffusion V6 XL.
- Revamp the dataset's aesthetic tags based on the developer's preference.
- Update the quality score and aesthetic score criteria.
- Use FP16 mixed-precision training.
- Train for more epochs.