license: other
license_name: fair-ai-public-license-1.0-sd
license_link: https://freedevproject.org/faipl-1.0-sd/
datasets:
  - KBlueLeaf/danbooru2023-webp-4Mpixel
  - KBlueLeaf/danbooru2023-sqlite
language:
  - en
library_name: diffusers
pipeline_tag: text-to-image

Kohaku XL Epsilon rev2

Join us: https://discord.gg/tPBsKDyRR5

Rev2 Features

  • Resumed from Kohaku XL Epsilon rev1
  • 1.54M images (1,536,902), 5 epochs
  • Trained on selected artists' artworks and images about selected series/games
  • Trained on PVC figure photos, so it can generate PVC style without any additional models

Usage (PLEASE READ THIS SECTION)

Prompt Format

<1girl/1boy/1other/...>, <character>, <series>, <artists>, <general tags>, <quality tags>, <year tags>, <meta tags>, <rating tags>
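The tag order above can be assembled programmatically. The following is a minimal sketch, not part of the model's tooling: the helper name `build_prompt` and its keyword names are hypothetical, and the example tags `solo` and `looking at viewer` are only illustrative general tags.

```python
def build_prompt(subject, character=(), series=(), artists=(), general=(),
                 quality=(), year=(), meta=(), rating=()):
    """Join tag groups in the order given by the prompt format.

    Hypothetical helper: empty groups and blank tags are skipped.
    """
    groups = [[subject], character, series, artists, general,
              quality, year, meta, rating]
    tags = [tag.strip() for group in groups for tag in group
            if tag and tag.strip()]
    return ", ".join(tags)


prompt = build_prompt(
    "1girl",
    general=["solo", "looking at viewer"],  # illustrative general tags
    quality=["masterpiece"],
    year=["newest"],
    rating=["safe"],
)
print(prompt)  # 1girl, solo, looking at viewer, masterpiece, newest, safe
```

Keeping each group as a separate argument makes it harder to place quality, year, or rating tags ahead of the content tags by accident.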

Special Tags

  • Quality tags: masterpiece, best quality, great quality, good quality, normal quality, low quality, worst quality
  • Rating tags: safe, sensitive, nsfw, explicit
  • Date tags (the <year tags> slot in the prompt format): newest, recent, mid, early, old

Rating tags

  • General: safe
  • Sensitive: sensitive
  • Questionable: nsfw
  • Explicit: nsfw, explicit
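The rating mapping can be written as a small lookup table. This is a sketch under the assumption that the four levels are named as listed here; the names `RATING_TAGS` and `rating_tags` are hypothetical and not part of any library.

```python
# Rating level -> tags to put in the <rating tags> slot of the prompt.
RATING_TAGS = {
    "general": ["safe"],
    "sensitive": ["sensitive"],
    "questionable": ["nsfw"],
    "explicit": ["nsfw", "explicit"],
}


def rating_tags(rating):
    """Return the rating tags for a rating level (case-insensitive)."""
    key = rating.strip().lower()
    if key not in RATING_TAGS:
        raise ValueError(f"unknown rating level: {rating!r}")
    return list(RATING_TAGS[key])  # copy so callers cannot mutate the table


print(", ".join(rating_tags("Explicit")))  # nsfw, explicit
```

Note that the explicit level uses two tags, so a single-tag lookup would lose information.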

Resolution

This model was trained with aspect ratio bucketing (ARB) at a base resolution of 1024x1024, with a minimum bucket resolution of 256 and a maximum of 4096, so the standard SDXL resolutions work. A resolution slightly higher than 1024x1024 is recommended, and applying a hires-fix is suggested for better results.
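One way to pick a width and height for a given aspect ratio is sketched below. It is an illustration only: the helper `pick_resolution` is hypothetical, snapping to multiples of 64 is an assumption (a common bucketing step, not something stated in this card), and the 1152x1152 pixel budget is just one example of "slightly higher than 1024x1024".

```python
import math


def pick_resolution(aspect_ratio, target_pixels=1024 * 1024,
                    multiple=64, min_side=256, max_side=4096):
    """Return (width, height) close to target_pixels for width/height = aspect_ratio.

    Sides are snapped to `multiple` (assumed step) and clamped to the
    256..4096 bucket range listed for this model.
    """
    if aspect_ratio <= 0:
        raise ValueError("aspect_ratio must be positive")
    height = math.sqrt(target_pixels / aspect_ratio)
    width = height * aspect_ratio

    def snap(value):
        snapped = int(round(value / multiple)) * multiple
        return max(min_side, min(max_side, snapped))

    return snap(width), snap(height)


# Square image with a slightly larger pixel budget than 1024x1024.
print(pick_resolution(1.0, target_pixels=1152 * 1152))
# Portrait image (2:3) at the base pixel budget.
print(pick_resolution(2 / 3))
```

With diffusers, the resulting values can be passed as the `width` and `height` arguments when calling an SDXL pipeline.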

Training

  • Hardware: Quad RTX 3090s
  • Num Train Images: 1,536,902
  • Total Epochs: 5
  • Total Steps: 15015
  • Training Time: 410 hours (wall time)
  • Batch Size: 4
  • Grad Accumulation Steps: 32
  • Equivalent Batch Size: 512
  • Optimizer: Lion8bit
  • Learning Rate: 1e-5 for UNet / 2e-6 for TE
  • LR Scheduler: Cosine (with warmup)
  • Warmup Steps: 1000
  • Weight Decay: 0.1
  • Betas: 0.9, 0.95
  • Min SNR Gamma: 5
  • Noise Offset: 0.0357
  • Resolution: 1024x1024
  • Min Bucket Resolution: 256
  • Max Bucket Resolution: 4096
  • Mixed Precision: FP16
  • Caption Tag Dropout: 0.2
  • Caption Dropout: 0.05
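The batch and schedule figures above can be cross-checked with a few lines of arithmetic. This sketch makes two assumptions that the list does not state: the batch size of 4 is per GPU, and the cosine schedule uses a linear warmup and decays to zero at the final step. The function `lr_at` is hypothetical and is not the training script's scheduler.

```python
import math

NUM_GPUS = 4          # "Quad RTX 3090s"
BATCH_SIZE = 4        # assumed to be per GPU
GRAD_ACCUM = 32
TOTAL_STEPS = 15015
TOTAL_EPOCHS = 5
WARMUP_STEPS = 1000
PEAK_LR_UNET = 1e-5

# 4 GPUs x 4 samples x 32 accumulation steps = 512 samples per optimizer step.
equivalent_batch = NUM_GPUS * BATCH_SIZE * GRAD_ACCUM
steps_per_epoch = TOTAL_STEPS // TOTAL_EPOCHS
samples_per_epoch = steps_per_epoch * equivalent_batch


def lr_at(step, peak_lr=PEAK_LR_UNET, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup to peak_lr, then cosine decay to zero (assumed shape)."""
    if step < warmup:
        return peak_lr * step / warmup
    progress = min(1.0, (step - warmup) / max(1, total - warmup))
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


print(equivalent_batch, steps_per_epoch, samples_per_epoch)
for step in (0, 500, 1000, 8000, 15015):
    print(step, lr_at(step))
```

Under these assumptions the equivalent batch size comes out to 512, matching the listed value, and each epoch covers 3003 x 512 = 1,537,536 samples. That is slightly above the listed 1,536,902 images, which may reflect rounding up of partially filled buckets.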

License

Fair AI Public License 1.0-SD: https://freedevproject.org/faipl-1.0-sd/