# Tagging methodology for Kazusa (Blue Archive)
## README / Intro
Since I've seen a few people share this already I'll provide this disclaimer.
This is not really intended to be a guide, it's just a log/checklist of my process, for my own benefit, since I repeat this for a lot of LoRAs and I got tired of winging it every single time. I've put only the slightest amount of effort into making it accessible to others.
I don't claim that any or all of these are optimal, nor can I confidently put them forth as recommendations. They're literally just a record of the steps I follow while tagging, gradually developed over ~16 characters using some version of the process below.
Still, I can at least point to my pre-Koharu LoRAs (which used pure WD1.4 tags) and the ones that came after (where I started heavily editing tags) and see a steady progression in quality and prompting flexibility despite using mostly the same training settings for each one.
Yes, it takes forever to do all of this shit. No, I don't recommend it unless you're extremely autistic; raw WD1.4 tags are probably good enough for most people. If you intend to do this for more than a few characters, I strongly recommend learning [Hydrus](https://hydrusnetwork.github.io/hydrus/introduction.html); it makes all of this way, way less tedious compared to doing it with crappier tools.
---
## Prep
- Scraped `1girl kazusa_(blue_archive) order:popularity` from sancom into Hydrus, then curated for quality.
- Kazusa has a shitload of good art so I had to be very picky to get down to 280 images, which is still a lot. In hindsight I think huge datasets aren't really a problem; they let you train for longer without overfitting.
- Gelbooru is probably fine too. Danbooru sucks for ロリ unless you have Gold.
- I also got a few newer images from pixiv, don't remember which ones.
- Exported final images from Hydrus to feed into WD1.4 Tagger
- Auto-tagged with WD1.4 SwinV2 at a 0.25 confidence threshold
- Reimported images+tags into Hydrus using the .txt sidecar feature. I strongly recommend putting WD1.4 tags in a separate tag domain so they aren't mixed in with shit scraped from boorus.
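The thresholding and sidecar steps above can be sketched roughly as follows. This is only an illustration, not what WD1.4 Tagger or Hydrus do internally: the `filter_tags` and `write_sidecar` helpers are hypothetical, the scores are assumed to arrive as a plain `tag -> confidence` dict, and the `<image name>.txt`, one-tag-per-line sidecar layout is an assumption you should check against your own Hydrus import settings.

```python
from pathlib import Path


def filter_tags(scores: dict[str, float], threshold: float = 0.25) -> list[str]:
    """Keep tags at or above the threshold, highest confidence first."""
    kept = [(tag, score) for tag, score in scores.items() if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    # Assumption: underscores become spaces, except in very short tags
    # (kaomoji such as o_o or @_@), which are left untouched.
    return [tag.replace("_", " ") if len(tag) > 3 else tag for tag, _ in kept]


def write_sidecar(image_path: Path, tags: list[str]) -> Path:
    """Write tags next to the image as '<image name>.txt', one tag per line."""
    sidecar = image_path.with_name(image_path.name + ".txt")
    sidecar.write_text("\n".join(tags) + "\n", encoding="utf-8")
    return sidecar
```

For example, `filter_tags({"halo": 0.3, "extra_ears": 0.2})` keeps only `halo`, since `extra_ears` falls below 0.25.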
## Tagging
- Tag unique features
- `halo` / `demon horns` / `low wings`
- Remove when not present or out of view. WD1.4 likes putting `halo` even on images where no halo is visible.
- **Kazusa**: `halo` / `animal ears`
- Pruned `extra ears` as it seems redundant and intrinsic to the character.
- Tag outfit variants with a single master tag
- **Kazusa**:
- Uniform: `school uniform` / `black jacket`
- Sometimes the jacket appears without the rest of the uniform; those images were not tagged `school uniform`
- Non-canon costumes
- Add `alternate costume`
- Nudity (WD1.4 usually does this accurately)
- `nude` / `completely nude`
- Prune eye colors
- Keep tags which describe unusual eye features (`multicolored eyes`, `heterochromia`, `slit pupils`) as they can otherwise be too subtle and inconsistently drawn for the AI to notice
- Prune hair colors
- This includes `two-toned hair`, `gradient hair`, etc. The AI learns all of these very consistently without the tags, likely because artists tend to draw them consistently
- Partially prune hair styles
- Leave key, defining style tags like `twintails`, `ponytail`, `short hair with long locks`, `twin braids`, etc.
- Prune exceedingly common tags like `bangs` / `sidelocks` / `eyebrows visible through hair` / `hair between eyes`, etc.
- Somewhat arbitrary, but I just don't think there's much value in them because they're ubiquitous and caption space is limited
- Prune length, except for images which differ from the character's usual length
- If you leave the length tags in, the AI is more likely to get the hair length wrong when it isn't prompted, which isn't a huge deal.
- Add `alternate hairstyle` and/or `alternate hair length` on applicable images, which can be used to more easily change styles while prompting
- **Kazusa**: `short hair, colored inner hair` -- while I would usually prune these, they're really her only defining hairstyle traits
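A first pass over the eye color, hair color, and hair style pruning rules could be scripted before the manual review. The sketch below is an assumption-heavy illustration: the `prune` helper, the color list, and the inclusion of `multicolored hair` and the length tags are my own choices for the example, not an exhaustive or authoritative list.

```python
# Illustrative tag lists; extend them to match your own dataset.
COLORS = {"black", "white", "grey", "brown", "blonde", "red", "pink",
          "orange", "yellow", "green", "blue", "aqua", "purple"}
MULTI_COLOR_HAIR = {"two-toned hair", "gradient hair", "multicolored hair"}
COMMON_HAIR = {"bangs", "sidelocks", "eyebrows visible through hair",
               "hair between eyes"}
LENGTHS = {"very short hair", "short hair", "medium hair", "long hair",
           "very long hair"}


def prune(tags: list[str], keep: frozenset[str] = frozenset()) -> list[str]:
    """Drop plain eye/hair colors, ubiquitous hair tags, and hair lengths.

    Tags listed in `keep` always survive (per-character exceptions).
    """
    result = []
    for tag in tags:
        if tag in keep:
            result.append(tag)
            continue
        head, _, tail = tag.rpartition(" ")
        is_plain_color = head in COLORS and tail in {"eyes", "hair"}
        if is_plain_color or tag in MULTI_COLOR_HAIR | COMMON_HAIR | LENGTHS:
            continue
        result.append(tag)
    return result
```

Unusual eye features such as `slit pupils` or `heterochromia` pass through untouched because they don't match the plain `<color> eyes` pattern. For Kazusa, `keep=frozenset({"short hair", "colored inner hair"})` preserves her defining hairstyle tags.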
- Fix up hair ornaments
- Prune generic `hair ornament` in favor of more specificity
- `hairclip` / `black headband` / `hair flower` / `hair ribbon`, etc.
- Consolidate tags that have color variants (`headband` >> `black headband`)
- **Kazusa**: `hairclip`
- Consolidate outfits
- Only tag an item when it is actually visible. If it is only barely visible along the edge of an image, keep in mind it may be cropped during bucketing.
- Danbooru's wiki entry for a character often provides a good list of tags for their entire outfit.
- **Kazusa outfits**:
- School Uniform
- `black choker`
- `hooded jacket`
- `black jacket`
- `green sailor collar`
- `pink neckerchief`
- `miniskirt`
- `pleated skirt`
- `white skirt`
- `black pantyhose`
- `sneakers`
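With the component list written down, candidates for the outfit's master tag can be flagged automatically and then checked by eye. This is a hypothetical helper, not part of the process above: the `min_hits=3` cutoff is an arbitrary assumption, chosen only so that an image showing nothing but the jacket does not pick up `school uniform`.

```python
# Component tags for Kazusa's school uniform, copied from the list above.
SCHOOL_UNIFORM = {
    "black choker", "hooded jacket", "black jacket", "green sailor collar",
    "pink neckerchief", "miniskirt", "pleated skirt", "white skirt",
    "black pantyhose", "sneakers",
}


def add_master_tag(
    tags: list[str],
    master: str = "school uniform",
    components: set[str] = SCHOOL_UNIFORM,
    min_hits: int = 3,  # assumed cutoff; tune it per outfit
) -> list[str]:
    """Append the master tag when enough outfit components are present."""
    hits = components & set(tags)
    if len(hits) >= min_hits and master not in tags:
        return [*tags, master]
    return list(tags)
```

Treat the output as suggestions: a cropped image may show three components and still not read as the full uniform.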
- Fix up sleeves
- e.g. `long sleeves` / `puffy long sleeves` / `detached sleeves`
- You only need one, but pick one and be consistent. If sleeves aren't tagged the AI tends to add them inappropriately (such as when prompting for sleeveless outfits or nudity)
- Fix up collars
- e.g. `detached collar` / `collared shirt` / `choker` / etc.
- Same deal as sleeves, they tend to appear when unwanted if not consistently tagged according to actual visibility
- Fix up clothing state
- e.g. `open jacket` / `open shirt` / `partially undressed` / `off shoulder`
- The tagger is generally good at this but it can help to double-check for weird outfits
- Tag expressions
- This is tedious and the autotagger doesn't help you out much, but tagging these can really help the AI nail multiple iconic expressions for a character
- Start by searching for images that have none of these, and add whichever one applies.
- `open mouth`
- `closed mouth`
- `parted lips`
- Sometimes applies with `open mouth`
- Then proceed through each image and add one of these
- `smile` / `light smile` / `:d` / `grin` (exposed teeth only)
- `:o` / `:<` / `expressionless` / `serious`
- `wavy mouth` / `embarrassed`
- `pout` / `:t` / `tsundere`
- `nervous` / `nervous smile`
- `flustered` / `swirly eyes` / `@_@`
- `surprised` / `o_o` / `wide-eyed`
- `upset` / `annoyed` / `frustrated` / `v-shaped eyebrows`
- `naughty face` / `seductive smile`
- `smug` / `:3` / `smirk`
- `yelling` / `frown`
- `closed eyes` / `one eye closed`
- WD1.4 almost always gets these two
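The first expression pass (finding images with no mouth-state tag at all) is easy to do outside Hydrus too, if the tags are already loaded from sidecar files. A minimal sketch, assuming a hypothetical `dataset` mapping of file name to tag set:

```python
MOUTH_TAGS = {"open mouth", "closed mouth", "parted lips"}


def missing_mouth_state(dataset: dict[str, set[str]]) -> list[str]:
    """Return the names of images that carry none of the mouth-state tags."""
    return sorted(name for name, tags in dataset.items() if not tags & MOUTH_TAGS)


# Usage: only the second image still needs a mouth-state tag.
example = {
    "001.png": {"1girl", "smile", "open mouth"},
    "002.png": {"1girl", "halo"},
}
print(missing_mouth_state(example))
```

The same pattern works for any "every image should have one of these" group, such as the sleeve or collar tags.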
- Tag camera angles/composition
- Most of these aren't very high value, but `from x` can be helpful.
- `cowboy shot`
- `upper body`
- `full body`
- `portrait`
- `feet out of frame`
- `cropped torso` / `cropped legs`
- `from side` / `from above` / `from below` / `from behind`
- Tag iconic poses, actions, or props
- Props need to show up often in training data for this to be worth it.
- `v` / `peace sign` / `standing on one leg`
- `holding dango` / `weapon case` / `fashion magazine`
- **Kazusa**
- `mouth hold`
- `eating`
- `macaron`
- Flip through each image and use Hydrus's "related tags" feature to quickly identify important tags that might be missing.
- This feature looks at other images with similar tags to provide suggestions. Good for spotting things you or the tagger might have missed.
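To give a feel for the idea, here is a naive co-occurrence version of tag suggestion. It is not Hydrus's actual algorithm, which I haven't looked at; the `suggest_related` helper and its scoring (one point per shared tag with each other image) are assumptions made purely for illustration.

```python
from collections import Counter


def suggest_related(
    target: set[str], dataset: dict[str, set[str]], top_n: int = 5
) -> list[str]:
    """Suggest tags that often appear on images sharing tags with `target`."""
    scores: Counter[str] = Counter()
    for tags in dataset.values():
        overlap = len(target & tags)
        if overlap == 0:
            continue  # unrelated image, ignore it
        # Every tag the target lacks gets credit weighted by the overlap.
        for tag in tags - target:
            scores[tag] += overlap
    return [tag for tag, _ in scores.most_common(top_n)]
```

An image tagged `school uniform` and `black jacket` but not `halo` would get `halo` suggested if most other uniform images carry it, which is exactly the kind of miss worth double-checking.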