Chidori Michiru (Blue Archive)

Usage

Use any or all of these tags to summon Michiru: michiru, 1girl, halo, yellow eyes, grey hair, small breasts. Tag pruning means you don't need to prompt for much other than michiru to get a pretty good result -- eyes and hair are optional but can help correct occasional mistakes.

The AI really tries to do her ninja kuji-in but unsurprisingly fucks it up more often than not, holding up the wrong number of fingers or just generating messed up hands. The tail is a bit iffy -- I tagged it as tail, but prompting for that gets a cat tail half the time. Try raccoon tail instead.

For her normal outfit: school uniform, blue skirt, black pantyhose, floral print, black scarf, bridal gauntlets. I tagged images with sarashi even when it was only partially visible under her uniform.

Unlike Koharu's LoRA, I was a little more specific with clothing colors when tagging. Skirts are always tagged blue skirt, the scarf is always black scarf, etc. The hope was that this would make it possible to get her normal clothing in a different color, and avoid overfitting skirt and scarf to the exact ones she normally wears. It sorta works, but if you want a red skirt you have to emphasize it and negative prompt blue skirt.
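A rough sketch of the recoloring trick, in case it helps. It assumes an A1111-style `(tag:weight)` emphasis syntax, which other UIs may not share, and the 1.3 emphasis value and the `recolor_prompt` helper are made up for illustration, not something I tuned.

```python
# Tags taken from the Usage section above.
CHARACTER_TAGS = ["michiru", "1girl", "halo", "yellow eyes", "grey hair"]
OUTFIT_TAGS = [
    "school uniform", "blue skirt", "black pantyhose",
    "floral print", "black scarf", "bridal gauntlets",
]


def emphasize(tag, weight):
    """Wrap a tag in A1111-style emphasis (assumed syntax)."""
    return f"({tag}:{weight})"


def recolor_prompt(old_tag, new_tag, weight=1.3):
    """Swap one outfit tag for an emphasized replacement.

    Returns (positive, negative); the original tag goes into the
    negative prompt so the trained color is pushed away.
    The default weight is a hypothetical starting point.
    """
    outfit = [emphasize(new_tag, weight) if t == old_tag else t
              for t in OUTFIT_TAGS]
    positive = ", ".join(CHARACTER_TAGS + outfit)
    negative = old_tag
    return positive, negative


positive, negative = recolor_prompt("blue skirt", "red skirt")
print("prompt:  ", positive)
print("negative:", negative)
```

The same helper should work for the scarf (swap black scarf for another color), though how well it holds up probably varies by model.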

Weights from 0.8 to 1.05 should work well; it's perhaps slightly overtrained. I included both epoch 3 and epoch 4, and epoch 3 might actually be a bit better.
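If you want to compare the two epochs across that weight range, something like the following generates the prompt variants. The `<lora:name:weight>` syntax is the A1111 convention and is an assumption here, and the filenames `michiru-epoch3` / `michiru-epoch4` are placeholders, so use whatever the files are actually called.

```python
# Placeholder filenames; substitute the real LoRA file names.
LORA_FILES = ["michiru-epoch3", "michiru-epoch4"]

# 0.80 .. 1.05 in steps of 0.05, built from integers to avoid float drift.
WEIGHTS = [w / 100 for w in range(80, 106, 5)]


def lora_tag(name, weight):
    """Format an A1111-style LoRA tag (assumed syntax)."""
    return f"<lora:{name}:{weight}>"


def sweep(base_prompt):
    """Yield one prompt per (file, weight) pair for side-by-side comparison."""
    for name in LORA_FILES:
        for weight in WEIGHTS:
            yield f"{base_prompt}, {lora_tag(name, weight)}"


for prompt in sweep("michiru, 1girl"):
    print(prompt)
```

Running each variant with a fixed seed makes it easier to see where the overtraining starts to show.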

Training

All parameters are provided in the accompanying JSON files.

  • Trained on a curated set of 88 images repeated 10 times.
    • Dataset included a mixture of SFW and NSFW.
    • This dataset was smaller than Koharu's, but I maintained the same number of steps.
  • Initially tagged with WD1.4, then performed heavy pruning and editing.
    • Removed as many inaccurate tags as possible
    • Made sure important traits were present and consistently described, and traits like halo were consistent with actual visibility
    • Pruned redundant tags and simplified outfits so that they were always tagged with the same handful of tags
    • Added camera angles and image composition hints
    • Added a few facial expressions
  • Different learning rate than usual.
    • 5e-5 text encoder (same as Koharu's, but typically 1e-5 ~ 2e-5)
    • 3e-4 UNet (typically one order of magnitude higher than the text encoder rate)
  • Trained without VAE.
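
For a sense of scale, here is the step arithmetic implied by 88 images repeated 10 times. The batch sizes below are hypothetical (the real value is in the JSON files), and the sketch assumes each repeat shows every image once per epoch and ignores gradient accumulation.

```python
import math

NUM_IMAGES = 88   # from the dataset description above
REPEATS = 10      # from the dataset description above


def steps_per_epoch(num_images, repeats, batch_size):
    """Optimizer steps in one epoch when every image is seen `repeats` times."""
    return math.ceil(num_images * repeats / batch_size)


# Hypothetical batch sizes, shown only to illustrate the scaling.
for batch_size in (1, 2, 4):
    per_epoch = steps_per_epoch(NUM_IMAGES, REPEATS, batch_size)
    print(f"batch_size={batch_size}: {per_epoch} steps/epoch, "
          f"{3 * per_epoch} steps at epoch 3, {4 * per_epoch} at epoch 4")

# Ratio between the two learning rates listed above.
print("UNet LR / text encoder LR =", round(3e-4 / 5e-5, 2))
```

With a batch size of 1 that works out to 880 steps per epoch, so the epoch 3 and epoch 4 files would sit at 2640 and 3520 steps under these assumptions.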