---
license: openrail++
tags:
- text-to-image
- stable-diffusion
- diffusers
---
# AnimeBoysXL v1.0
**It takes substantial time and effort to bake models. If you appreciate my models, I would be grateful if you could support me on [Ko-fi](https://ko-fi.com/koolchh) ☕.**
## Features
- ✔️ **Good for inference**: AnimeBoysXL is a flexible model which is good at generating images of anime boys and males-only content in a wide range of styles.
- ✔️ **Good for training**: AnimeBoysXL is suitable for further training, thanks to its neutral style and its ability to recognize a wide range of concepts. Feel free to train your own anime boy model/LoRA from AnimeBoysXL.
- ❌ AnimeBoysXL is not optimized for creating anime girls. Please consider using other models for that purpose.
## Inference Guide
- **Prompt**: Use tag-based prompts to describe your subject.
  - Append `, best quality, amazing quality, best aesthetic, absurdres` to the prompt to improve image quality.
  - (*Optional*) Append `, year YYYY` to the prompt to shift the output toward the prevalent style of that year. `YYYY` is a 4-digit year, e.g. `, year 2023`.
- **Negative prompt**: Choose one of the following two presets.
  1. Heavy (*recommended*): `lowres, (bad:1.05), text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts, 1girl, breasts`
  2. Light: `lowres, jpeg artifacts, worst quality, watermark, blurry, bad aesthetic, 1girl, breasts`
  - (*Optional*) Add `, realistic, lips, nose` to the negative prompt if you need a flat, anime-style face.
- **VAE**: Make sure you're using [SDXL VAE](https://huggingface.co/stabilityai/sdxl-vae/tree/main).
- **Sampling method, sampling steps and CFG scale**: I find **(Euler a, 28, 5)** good. You are encouraged to experiment with other settings.
- **Width and height**: **832×1216** for portrait, **1024×1024** for square, and **1216×832** for landscape.
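The guide above can be collected into a few small helpers. This is only a sketch: the function and constant names are made up for illustration, and the subject tags in the usage example (`1boy`, `solo`) are placeholders, not recommendations from the model author. The year check uses the [2005, 2023] range listed under Year tags.

```python
# Hypothetical helpers that assemble prompts following the inference guide.
QUALITY_SUFFIX = "best quality, amazing quality, best aesthetic, absurdres"

NEGATIVE_PRESETS = {
    "heavy": (
        "lowres, (bad:1.05), text, error, missing, extra, fewer, cropped, "
        "jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, "
        "unfinished, chromatic aberration, scan, scan artifacts, 1girl, breasts"
    ),
    "light": (
        "lowres, jpeg artifacts, worst quality, watermark, blurry, "
        "bad aesthetic, 1girl, breasts"
    ),
}
FLAT_FACE_NEGATIVE = "realistic, lips, nose"

# (width, height) pairs from the guide.
RESOLUTIONS = {
    "portrait": (832, 1216),
    "square": (1024, 1024),
    "landscape": (1216, 832),
}


def build_prompt(subject_tags, year=None):
    """Join subject tags, the quality suffix, and an optional year tag."""
    parts = list(subject_tags) + [QUALITY_SUFFIX]
    if year is not None:
        if not 2005 <= year <= 2023:
            raise ValueError("year tags cover 2005 through 2023")
        parts.append(f"year {year}")
    return ", ".join(parts)


def build_negative(preset="heavy", flat_face=False):
    """Return a negative prompt preset, optionally with the flat-face tags."""
    negative = NEGATIVE_PRESETS[preset]
    if flat_face:
        negative += ", " + FLAT_FACE_NEGATIVE
    return negative


# Placeholder subject tags, for illustration only.
prompt = build_prompt(["1boy", "solo"], year=2023)
negative_prompt = build_negative("heavy")
width, height = RESOLUTIONS["portrait"]
```

The resulting `prompt`, `negative_prompt`, `width`, and `height` can be passed to the pipeline call in the Diffusers example.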
## 🧨Diffusers Example Usage
```python
import torch
from diffusers import DiffusionPipeline

# Load the fp16 weights and move the pipeline to the GPU.
pipe = DiffusionPipeline.from_pretrained(
    "Koolchh/AnimeBoysXL-v1.0",
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
)
pipe.to("cuda")

# Put your tag-based subject description before the quality tags.
prompt = ", best quality, amazing quality, best aesthetic, absurdres"
negative_prompt = "lowres, (bad:1.05), text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts, 1girl, breasts"

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=1024,
    height=1024,
    guidance_scale=5,
    num_inference_steps=28,
).images[0]
```
## Training Details
AnimeBoysXL is trained from [Stable Diffusion XL Base 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0), on ~516k images.
The following tags are attached to the training data to make it easier to steer toward either more aesthetic or more flexible results.
### Quality tags
| tag | score |
|-------------------|------------|
| `best quality` | >= 150 |
| `amazing quality` | [100, 150) |
| `great quality` | [75, 100) |
| `normal quality` | [0, 75) |
| `bad quality` | (-5, 0) |
| `worst quality` | <= -5 |
### Aesthetic tags
| tag | score |
|--------------------|--------------|
| `best aesthetic` | >= 6.675 |
| `great aesthetic` | [6.0, 6.675) |
| `normal aesthetic` | [5.0, 6.0) |
| `bad aesthetic` | < 5.0 |
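The two score tables above amount to simple threshold lookups. The sketch below restates them in code; the function names are hypothetical, and the card does not say where the scores come from, so the inputs are treated as plain numbers.

```python
def quality_tag(score):
    """Map a quality score to its tag, following the Quality tags table."""
    if score >= 150:
        return "best quality"
    if score >= 100:
        return "amazing quality"
    if score >= 75:
        return "great quality"
    if score >= 0:
        return "normal quality"
    if score > -5:
        return "bad quality"
    return "worst quality"


def aesthetic_tag(score):
    """Map an aesthetic score to its tag, following the Aesthetic tags table."""
    if score >= 6.675:
        return "best aesthetic"
    if score >= 6.0:
        return "great aesthetic"
    if score >= 5.0:
        return "normal aesthetic"
    return "bad aesthetic"
```

Note that the interval bounds are half-open: a quality score of exactly 100 maps to `amazing quality`, and exactly -5 maps to `worst quality`.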
### Rating tags
| tag | rating |
|-----------------|--------------|
| (None) | general |
| `slightly nsfw` | sensitive |
| `fairly nsfw` | questionable |
| `very nsfw` | explicit |
### Year tags
`year YYYY` where `YYYY` is in the range of [2005, 2023].
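The rating and year tags can be sketched the same way. The names below are hypothetical, and one detail is an assumption: the card does not say how images dated outside [2005, 2023] were handled, so this sketch simply emits no year tag for them.

```python
# Rating -> tag, following the Rating tags table ("general" gets no tag).
RATING_TAGS = {
    "general": None,
    "sensitive": "slightly nsfw",
    "questionable": "fairly nsfw",
    "explicit": "very nsfw",
}


def year_tag(year):
    """Return `year YYYY` for years in [2005, 2023].

    Assumption: years outside the range get no tag (returns None).
    """
    if 2005 <= year <= 2023:
        return f"year {year}"
    return None


def metadata_tags(rating, year):
    """Collect the rating and year tags that apply, skipping empty ones."""
    candidates = [RATING_TAGS[rating], year_tag(year)]
    return [tag for tag in candidates if tag is not None]
```

For example, `metadata_tags("sensitive", 2021)` yields `["slightly nsfw", "year 2021"]`, while a `general` image yields only its year tag.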
### Training configurations
- Hardware: 4 × NVIDIA A100 80GB GPUs
- Optimizer: AdaFactor
- Gradient accumulation steps: 8
- Batch size: 4 * 8 * 4 = 128
- Learning rates:
  - 8e-6 for U-Net
  - 5.2e-6 for text encoder 1 (CLIP ViT-L)
  - 4.8e-6 for text encoder 2 (OpenCLIP ViT-bigG)
- Learning rate schedule: constant with 250 warmup steps
- Mixed precision training type: BF16
- Epochs: 20
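The batch size and learning rate schedule above can be worked through in a few lines. This sketch rests on two assumptions that the card does not state: that the factors in `4 * 8 * 4` are 4 GPUs, 8 gradient accumulation steps, and a per-device batch of 4, and that the warmup is linear from zero. All names are illustrative.

```python
NUM_GPUS = 4
GRAD_ACCUM_STEPS = 8
PER_DEVICE_BATCH = 4  # assumption: the remaining factor in 4 * 8 * 4

# Samples contributing to each optimizer step.
EFFECTIVE_BATCH = NUM_GPUS * GRAD_ACCUM_STEPS * PER_DEVICE_BATCH

WARMUP_STEPS = 250
BASE_LRS = {
    "unet": 8e-6,
    "text_encoder_1": 5.2e-6,  # CLIP ViT-L
    "text_encoder_2": 4.8e-6,  # OpenCLIP ViT-bigG
}


def lr_at(step, base_lr, warmup_steps=WARMUP_STEPS):
    """Constant learning rate after a warmup (assumed linear from zero)."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr


# Learning rate of each component halfway through warmup and after it.
for name, base_lr in BASE_LRS.items():
    print(name, lr_at(125, base_lr), lr_at(1000, base_lr))
```

Under these assumptions, each component reaches its full learning rate at optimizer step 250 and stays there for the rest of training.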