---
license: openrail++
tags:
- text-to-image
- stable-diffusion
- diffusers
widget:
- text: 2boys, multiple boys, yaoi, looking at another, hand on another's shoulder, smile, short hair, black hair, closed eyes, brown hair, blue eyes, shirt, lens flare, sky, cloud, blue sky, sweat, best quality, amazing quality, best aesthetic, absurdres, year 2023
  parameters:
    negative_prompt: lowres, (bad:1.05), text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts, 1girl, breasts, realistic, lips, nose
    width: 1024
    height: 1024
    guidance_scale: 5
    num_inference_steps: 28
  output:
    url: images/sample01.png
  example_title: sample1
---
# AnimeBoysXL v1.0
**Baking models takes substantial time and effort. If you appreciate my models, I would be grateful if you could support me on [Ko-fi](https://ko-fi.com/koolchh) ☕.**
<Gallery />
## Features
- ✔️ **Good for inference**: AnimeBoysXL is a flexible model that generates images of anime boys and male-only content in a wide range of styles.
- ✔️ **Good for training**: AnimeBoysXL is suitable for further training, thanks to its neutral style and its ability to recognize a large number of concepts. Feel free to train your own anime boy model/LoRA from AnimeBoysXL.
- ❌ AnimeBoysXL is not optimized for creating anime girls. Please consider using other models for that purpose.
## Inference Guide
- **Prompt**: Use tag-based prompts to describe your subject.
  - Append `, best quality, amazing quality, best aesthetic, absurdres` to the prompt to improve image quality.
  - (*Optional*) Append `, year YYYY` to the prompt to shift the output toward the prevalent style of that year. `YYYY` is a four-digit year, e.g. `, year 2023`.
- **Negative prompt**: Choose one of the following two presets.
  1. Heavy (*recommended*): `lowres, (bad:1.05), text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts, 1girl, breasts`
  2. Light: `lowres, jpeg artifacts, worst quality, watermark, blurry, bad aesthetic, 1girl, breasts`
  - (*Optional*) Add `, realistic, lips, nose` to the negative prompt if you need a flat, anime-style face.
- **VAE**: Make sure you're using [SDXL VAE](https://huggingface.co/stabilityai/sdxl-vae/tree/main).
- **Sampling method, sampling steps and CFG scale**: I find **(Euler a, 28, 5)** good. You are encouraged to experiment with other settings.
- **Width and height**: **832×1216** for portrait, **1024×1024** for square, and **1216×832** for landscape (width × height).
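
The prompt rules above can be sketched as a small helper. This is only an illustration, not part of the model or of any library: the names `build_prompt`, `build_negative_prompt`, and `RESOLUTIONS` are hypothetical, and the year check assumes the [2005, 2023] range listed under "Year tags".

```python
# Hypothetical helpers that assemble prompts according to the guide above.
QUALITY_SUFFIX = "best quality, amazing quality, best aesthetic, absurdres"

NEGATIVE_PRESETS = {
    "heavy": (
        "lowres, (bad:1.05), text, error, missing, extra, fewer, cropped, "
        "jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, "
        "unfinished, chromatic aberration, scan, scan artifacts, 1girl, breasts"
    ),
    "light": (
        "lowres, jpeg artifacts, worst quality, watermark, blurry, "
        "bad aesthetic, 1girl, breasts"
    ),
}
FLAT_FACE_NEGATIVES = "realistic, lips, nose"

# (width, height) presets from the guide.
RESOLUTIONS = {
    "portrait": (832, 1216),
    "square": (1024, 1024),
    "landscape": (1216, 832),
}


def build_prompt(subject_tags, year=None):
    """Append the quality suffix and, optionally, a `year YYYY` tag."""
    prompt = f"{subject_tags}, {QUALITY_SUFFIX}"
    if year is not None:
        # Assumption: only years seen in training (2005 to 2023) are meaningful.
        if not 2005 <= year <= 2023:
            raise ValueError("year must be between 2005 and 2023")
        prompt += f", year {year}"
    return prompt


def build_negative_prompt(preset="heavy", flat_face=False):
    """Return the chosen preset, optionally with the flat-face tags added."""
    negative = NEGATIVE_PRESETS[preset]
    if flat_face:
        negative += ", " + FLAT_FACE_NEGATIVES
    return negative
```

For example, `build_prompt("2boys, smile", year=2023)` yields the subject tags followed by the quality suffix and `year 2023`, and `RESOLUTIONS["portrait"]` gives the width and height to pass to the pipeline.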
## 🧨Diffusers Example Usage
```python
import torch
from diffusers import DiffusionPipeline

# Load the fp16 weights and move the pipeline to the GPU.
pipe = DiffusionPipeline.from_pretrained(
    "Koolchh/AnimeBoysXL-v1.0",
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
)
pipe.to("cuda")

# Prepend your own subject tags to the quality suffix below.
prompt = ", best quality, amazing quality, best aesthetic, absurdres"
negative_prompt = "lowres, (bad:1.05), text, error, missing, extra, fewer, cropped, jpeg artifacts, worst quality, bad quality, watermark, bad aesthetic, unfinished, chromatic aberration, scan, scan artifacts, 1girl, breasts"

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=1024,
    height=1024,
    guidance_scale=5,
    num_inference_steps=28,
).images[0]
```
## Training Details
AnimeBoysXL was trained from [Stable Diffusion XL Base 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0) on ~516k images.
The following tags are attached to the training data to make it easier to steer toward either more aesthetic or more flexible results.
### Quality tags
| tag | score |
|-------------------|------------|
| `best quality` | >= 150 |
| `amazing quality` | [100, 150) |
| `great quality` | [75, 100) |
| `normal quality` | [0, 75) |
| `bad quality` | (-5, 0) |
| `worst quality` | <= -5 |
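
The table can be read as a simple threshold lookup. The sketch below is illustrative only: `quality_tag` is a hypothetical name, and the card does not say how the score itself was computed, so the function only encodes the intervals shown.

```python
def quality_tag(score):
    """Map a numeric score to a quality tag using the intervals in the table."""
    if score >= 150:
        return "best quality"
    if score >= 100:
        return "amazing quality"
    if score >= 75:
        return "great quality"
    if score >= 0:
        return "normal quality"
    if score > -5:
        return "bad quality"
    return "worst quality"  # score <= -5
```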
### Aesthetic tags
| tag | score |
|--------------------|--------------|
| `best aesthetic` | >= 6.675 |
| `great aesthetic` | [6.0, 6.675) |
| `normal aesthetic` | [5.0, 6.0) |
| `bad aesthetic` | < 5.0 |
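
As a sketch of how these boundaries behave at the edges, the lookup below uses `bisect` so that each lower bound is inclusive, matching the half-open intervals in the table. The name `aesthetic_tag` is hypothetical, and the aesthetic scorer that produced the scores is not specified in this card.

```python
from bisect import bisect_right

# Lower bounds of the "normal", "great", and "best" intervals, in ascending order.
_BOUNDS = [5.0, 6.0, 6.675]
_TAGS = ["bad aesthetic", "normal aesthetic", "great aesthetic", "best aesthetic"]


def aesthetic_tag(score):
    """Map an aesthetic score to its tag; each lower bound is inclusive."""
    return _TAGS[bisect_right(_BOUNDS, score)]
```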
### Rating tags
| tag | rating |
|-----------------|--------------|
| (None) | general |
| `slightly nsfw` | sensitive |
| `fairly nsfw` | questionable |
| `very nsfw` | explicit |
### Year tags
`year YYYY` where `YYYY` is in the range of [2005, 2023].
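
A minimal sketch of how the rating and year tags could be derived from image metadata follows. The function name `metadata_tags`, the input format, and the tag order are assumptions for illustration; the card does not describe how the tags were actually placed in the training captions.

```python
# Rating-to-tag mapping from the "Rating tags" table; "general" gets no tag.
RATING_TAGS = {
    "general": None,
    "sensitive": "slightly nsfw",
    "questionable": "fairly nsfw",
    "explicit": "very nsfw",
}


def metadata_tags(rating, year):
    """Return the rating tag (if any) followed by the `year YYYY` tag."""
    tags = []
    rating_tag = RATING_TAGS[rating]
    if rating_tag is not None:
        tags.append(rating_tag)
    if not 2005 <= year <= 2023:
        raise ValueError("year must be between 2005 and 2023")
    tags.append(f"year {year}")
    return tags
```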
### Training configurations
- Hardware: 4 * Nvidia A100 80GB GPUs
- Optimizer: AdaFactor
- Gradient accumulation steps: 8
- Batch size: 4 * 8 * 4 = 128
- Learning rates:
  - 8e-6 for U-Net
  - 5.2e-6 for text encoder 1 (CLIP ViT-L)
  - 4.8e-6 for text encoder 2 (OpenCLIP ViT-bigG)
- Learning rate schedule: constant with 250 warmup steps
- Mixed precision training type: BF16
- Epochs: 20
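
The batch-size arithmetic and the learning rate schedule can be sketched as below. Two points are assumptions rather than facts from this card: the per-GPU batch size of 4 is inferred from 128 / (4 GPUs × 8 accumulation steps), and the warmup is taken to be linear from zero, as is common for "constant with warmup" schedules. The names `effective_batch_size` and `lr_at` are hypothetical.

```python
NUM_GPUS = 4
GRAD_ACCUM_STEPS = 8
PER_GPU_BATCH = 4  # assumption: inferred from 128 / (4 * 8)
WARMUP_STEPS = 250

BASE_LRS = {
    "unet": 8e-6,
    "text_encoder_1": 5.2e-6,  # CLIP ViT-L
    "text_encoder_2": 4.8e-6,  # OpenCLIP ViT-bigG
}


def effective_batch_size():
    """Number of samples contributing to each optimizer step."""
    return NUM_GPUS * GRAD_ACCUM_STEPS * PER_GPU_BATCH


def lr_at(step, base_lr, warmup_steps=WARMUP_STEPS):
    """Learning rate at an optimizer step: linear warmup (assumed), then constant."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr
```

Under these assumptions, `effective_batch_size()` reproduces the 128 stated above, and `lr_at(step, BASE_LRS["unet"])` ramps up over the first 250 steps and then stays at 8e-6.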