Update README.md
# Aniimage-1

Aniimage-1 is the first latent diffusion model developed by 8BitStudio.
It generates 256×256 anime images and was trained from scratch with a UNet + VAE + CLIP architecture.
The training set consists of 830,001 anime images from [Danbooru](https://danbooru.donmai.us/).
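In a latent diffusion model, the UNet denoises a compressed VAE latent instead of raw pixels. The sketch below shows the shape arithmetic only. The downsampling factor of 8 and the 4 latent channels are assumptions borrowed from common latent diffusion setups, not values stated for Aniimage-1, and `latent_shape` is a hypothetical helper.

```python
def latent_shape(image_size: int, downsample_factor: int = 8,
                 latent_channels: int = 4) -> tuple[int, int, int]:
    """Return (channels, height, width) of the VAE latent for a square image.

    downsample_factor and latent_channels are ASSUMED defaults,
    not confirmed values for Aniimage-1.
    """
    if image_size % downsample_factor != 0:
        raise ValueError("image_size must be divisible by downsample_factor")
    side = image_size // downsample_factor
    return (latent_channels, side, side)


# Under the assumed settings, a 256x256 image maps to a 32x32 latent.
print(latent_shape(256))  # (4, 32, 32)
```

If the actual VAE uses a different factor or channel count, pass those values instead of relying on the defaults.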
## Model Details

| | |
|---|---|
| **Resolution** | 256×256 |
| **Architecture** | Latent Diffusion (UNet + VAE + CLIP) |
| **Parameters** | ~400M |
| **Training Steps** | 88,000 |
| **Batch Size** | 64 |
| **Dataset** | ~830K curated anime images from Danbooru |
| **GPU** | NVIDIA RTX 5060 Ti 16GB |
| **Scheduler** | DDIM or DPM++ 2M |
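The training steps, batch size, and dataset size listed in Model Details give a rough estimate of how many passes over the data the run made. This is a back-of-the-envelope sketch: it assumes each training step consumes exactly one batch of 64 images (no gradient accumulation), which the model card does not state. The function names are illustrative.

```python
TRAINING_STEPS = 88_000   # from Model Details
BATCH_SIZE = 64           # from Model Details
DATASET_SIZE = 830_001    # image count stated in the introduction


def samples_seen(steps: int, batch_size: int) -> int:
    """Total images processed, ASSUMING one batch per training step."""
    return steps * batch_size


def approx_epochs(steps: int, batch_size: int, dataset_size: int) -> float:
    """Approximate number of full passes over the dataset."""
    return samples_seen(steps, batch_size) / dataset_size


print(samples_seen(TRAINING_STEPS, BATCH_SIZE))  # 5632000
print(round(approx_epochs(TRAINING_STEPS, BATCH_SIZE, DATASET_SIZE), 2))  # 6.79
```

Under that assumption, the model saw each image roughly 6.8 times.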
## Usage

```python
from generate import Generator

gen = Generator()
# Load the trained weights; replace the placeholder with your checkpoint path.
gen.load_model("path/to/checkpoint")
# Generate from a text prompt with 50 sampling steps and a guidance scale of 7.0.
image = gen.generate(prompt="a smiling anime girl with long blue hair", steps=50, cfg_scale=7.0)
```
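The `cfg_scale` argument in the usage example most likely refers to classifier-free guidance, although the model card does not confirm how `Generator` applies it. As a minimal sketch of the standard technique, the guided noise prediction extrapolates from the unconditional prediction toward the prompt-conditioned one. `apply_cfg` is a hypothetical helper, and plain lists stand in for tensors.

```python
def apply_cfg(eps_uncond: list[float], eps_cond: list[float],
              cfg_scale: float) -> list[float]:
    """Combine unconditional and conditional noise predictions.

    Standard classifier-free guidance formula:
        eps = eps_uncond + cfg_scale * (eps_cond - eps_uncond)
    This is an ASSUMED reading of `cfg_scale`, not code from the repository.
    """
    if len(eps_uncond) != len(eps_cond):
        raise ValueError("predictions must have the same length")
    return [u + cfg_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# A scale of 1.0 reproduces the conditional prediction unchanged.
print(apply_cfg([0.0, 1.0], [1.0, 1.0], 1.0))  # [1.0, 1.0]
# Larger scales push the result further in the direction the prompt suggests.
print(apply_cfg([0.0, 1.0], [1.0, 1.0], 7.0))  # [7.0, 1.0]
```

Higher values generally follow the prompt more closely at the cost of variety.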
## Capabilities

- Anime character generation with varied hair colors and styles
- School uniforms, fantasy outfits, maid dresses, and more
- Background scenes: cherry blossoms, night sky, interiors, nature
## Limitations

- 256×256 resolution: fine details such as hands and small features can be rough
- Faces can sometimes look similar or 'melty' across different prompts
- Complex multi-character scenes may show characters merging into each other
- Little to no NSFW content: the training dataset is mostly SFW
## What's Next

**Aniimage-1.5**, a 512×512 fine-tune of this model, is currently in development and is expected to significantly improve detail and clarity.
## License

Apache 2.0