lora-training / kazusa /tagging methodology.md

khanon

updates preview images

4b7c79c 10 months ago

	# Tagging methodology for Kazusa (Blue Archive)

	## README / Intro
	Since I've seen a few people share this already I'll provide this disclaimer.

	This is not really intended to be a guide, it's just a log/checklist of my process, for my own benefit, since I repeat this for a lot of LoRAs and I got tired of winging it every single time. I've put only the slightest amount of effort into making it accessible to others.

	I don't claim that any or all of these steps are optimal, nor can I confidently put them forth as recommendations. They're literally just a record of the steps I follow while tagging, gradually developed over ~16 characters using some version of the below process.

	Still, I can at least point to my pre-Koharu LoRAs (which used pure WD1.4 tags) and the ones that came after (where I started heavily editing tags) and see a steady progression in quality and prompting flexibility despite using mostly the same training settings for each one.

	Yes, it takes forever to do all of this shit. No, I don't recommend it unless you're extremely autistic; raw WD1.4 tags are probably good enough for most people. If you intend to do this for more than a few characters, I strongly recommend learning [Hydrus](https://hydrusnetwork.github.io/hydrus/introduction.html); it makes all of this way, way less tedious compared to doing it with crappier tools.

	---

	## Prep

	- Scraped `1girl kazusa_(blue_archive) order:popularity` from Sankaku Complex (sancom) and curated for quality.
	  - Kazusa has a shitload of good art so I had to be very picky to get down to 280 images, which is still a lot. In hindsight I think huge datasets aren't really a problem; they let you train for longer without overfitting.
	  - Gelbooru is probably fine too. Danbooru sucks for loli unless you have Gold.
	  - I also got a few newer images from pixiv; I don't remember which ones.
	- Exported final images from Hydrus to feed into WD1.4 Tagger.
	- Auto-tagged with WD1.4 SwinV2 at 0.25 confidence.
	- Reimported images+tags into Hydrus using the .txt sidecar feature. I strongly recommend putting WD1.4 tags in a separate tag domain so they aren't mixed in with shit scraped from boorus.
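
The thresholding and sidecar steps above can be sketched in Python. This is only an illustration, not the actual tagger or Hydrus code: the `scores` mapping (tag to confidence), the function names, and the sidecar layout (`<image name>.txt`, one tag per line) are all assumptions, so match the naming and separator to whatever your tagger writes and your Hydrus import is set up to read.

```python
from pathlib import Path

CONFIDENCE_THRESHOLD = 0.25  # the confidence value used in the Prep list


def filter_tags(scores, threshold=CONFIDENCE_THRESHOLD):
    """Keep tags scoring at or above the threshold, highest score first.

    `scores` is a hypothetical {tag: confidence} mapping standing in for
    whatever the auto-tagger returns for a single image.
    """
    kept = [(tag, score) for tag, score in scores.items() if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [tag for tag, _ in kept]


def write_sidecar(image_path, tags):
    """Write the tags next to the image as '<image name>.txt'.

    Assumption: one tag per line. Adjust the file name and separator to
    whatever your importer expects.
    """
    sidecar = Path(str(image_path) + ".txt")
    sidecar.write_text("\n".join(tags) + "\n", encoding="utf-8")
    return sidecar


if __name__ == "__main__":
    # Made-up scores, purely for illustration.
    example_scores = {"halo": 0.91, "animal ears": 0.88, "extra ears": 0.31, "hat": 0.07}
    print(filter_tags(example_scores))
```

A lower threshold gives more tags to prune by hand, while a higher one gives fewer false positives but more missing tags to add back.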

	## Tagging

	- Tag unique features
	  - `halo` / `demon horns` / `low wings`
	  - Remove when not present or out of view. WD1.4 likes putting `halo` even on images where no halo is visible.
	  - Kazusa: `halo` / `animal ears`
	    - Pruned `extra ears` as it seems redundant and intrinsic to the character.
	- Tag outfit variants with a single master tag
	  - Kazusa:
	    - Uniform: `school uniform` / `black jacket`
	      - Sometimes the jacket appears without the rest of the uniform; those images were not tagged `school uniform`.
	  - Non-canon costumes
	    - Add `alternate costume`
	  - Nudity (WD1.4 usually does this accurately)
	    - `nude` / `completely nude`
	- Prune eye colors
	  - Keep tags which describe unusual eye features (`multicolored eyes`, `heterochromia`, `slit pupils`) as they can otherwise be too subtle and inconsistently drawn for the AI to notice
	- Prune hair colors
	  - This includes `two-tone hair`, `gradient hair`, etc. The AI learns all of these very consistently without the tags, likely because artists tend to draw them consistently
	- Partially prune hair styles
	  - Leave key, defining style tags like `twintails`, `ponytail`, `short hair with long locks`, `twin braids`, etc.
	  - Prune exceedingly common tags like `bangs` / `sidelocks` / `eyebrows visible through hair` / `hair between eyes`, etc.
	    - Somewhat arbitrary, but I just don't think there's much value in them because they're ubiquitous and caption space is limited
	  - Prune length, except for images which differ from the character's usual length
	    - If you don't do this, the AI is more likely to get the hair length wrong when it isn't prompted, which isn't a huge deal.
	  - Add `alternate hairstyle` and/or `alternate hair length` on applicable images, which can be used to more easily change styles while prompting
	  - Kazusa: `short hair, colored inner hair` -- while I would usually prune these, they're really her only defining hairstyle traits
	- Fixup hair ornaments
	  - Prune generic `hair ornament` in favor of more specificity
	    - `hairclip` / `black headband` / `hair flower` / `hair ribbon`, etc.
	  - Consolidate tags that have color variants (`headband` >> `black headband`)
	  - Kazusa: `hairclip`
	- Consolidate outfits
	  - Only tag an item when it is actually visible. If it is only barely visible along the edge of an image, keep in mind it may be cropped during bucketing.
	  - Danbooru's wiki entry for a character often provides a good list of tags for a character's entire outfit.
	  - Kazusa outfits:
	    - School Uniform
	      - `black choker`
	      - `hooded jacket`
	      - `black jacket`
	      - `green sailor collar`
	      - `pink neckerchief`
	      - `miniskirt`
	      - `pleated skirt`
	      - `white skirt`
	      - `black pantyhose`
	      - `sneakers`
	- Fixup sleeves
	  - e.g. `long sleeves` / `puffy long sleeves` / `detached sleeves`
	  - You only need one, but pick one and be consistent. If sleeves aren't tagged the AI tends to add them inappropriately (such as when prompting for sleeveless outfits or nudity)
	- Fixup collars
	  - e.g. `detached collar` / `collared shirt` / `choker` / etc.
	  - Same deal as sleeves, they tend to appear when unwanted if not consistently tagged according to actual visibility
	- Fixup clothing state
	  - e.g. `open jacket` / `open shirt` / `partially undressed` / `off shoulder`
	  - The tagger is generally good at this but it can help to double-check for weird outfits
	- Tag expressions
	  - This is tedious and the autotagger doesn't help you out much, but tagging these can really help the AI nail multiple iconic expressions for a character
	  - Start by searching for images without one of these, and add them.
	    - `open mouth`
	    - `closed mouth`
	    - `parted lips`
	      - Sometimes applies with `open mouth`
	  - Then proceed through each image and add one of these
	    - `smile` / `light smile` / `:d` / `grin` (exposed teeth only)
	    - `:o` / `:<` / `expressionless` / `serious`
	    - `wavy mouth` / `embarrassed`
	    - `pout` / `:t` / `tsundere`
	    - `nervous` / `nervous smile`
	    - `flustered` / `swirly eyes` / `@_@`
	    - `surprised` / `o_o` / `wide-eyed`
	    - `upset` / `annoyed` / `frustrated` / `v-shaped eyebrows`
	    - `naughty face` / `seductive smile`
	    - `smug` / `:3` / `smirk`
	    - `yelling` / `frown`
	    - `closed eyes` / `one eye closed`
	      - WD1.4 almost always gets these two
	- Tag camera angles/composition
	  - Most of these aren't very high value, but `from x` can be helpful.
	  - `cowboy shot`
	  - `upper body`
	  - `full body`
	  - `portrait`
	  - `feet out of frame`
	  - `cropped torso` / `cropped legs`
	  - `from side` / `from above` / `from below` / `from behind`
	- Tag iconic poses, actions, or props
	  - Props need to show up often in training data for this to be worth it.
	  - `v` / `peace sign` / `standing on one leg`
	  - `holding dango` / `weapon case` / `fashion magazine`
	  - Kazusa
	    - `mouth hold`
	    - `eating`
	    - `macaron`
	- Flip through each image and use Hydrus's "related tags" feature to quickly identify important tags that might be missing.
	  - This feature looks at other images with similar tags to provide suggestions. Good for spotting things you or the tagger might have missed.
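
A few of the mechanical rules in this checklist (pruning eye and hair colors, dropping ubiquitous hair tags, consolidating color variants, dropping generic `hair ornament`, finding images with no mouth tag) could also be expressed as a small script. The sketch below is hypothetical and is not how the tagging above was done, which was by hand in Hydrus. The rule tables, the incomplete color list, and the function names are assumptions made up for illustration, and tags are assumed to use spaces rather than underscores.

```python
import re

# Hypothetical, incomplete rule tables loosely mirroring the checklist.
COLOR = r"(?:aqua|black|blonde|blue|brown|green|grey|orange|pink|purple|red|white|yellow)"
PRUNE_PATTERNS = [re.compile(rf"{COLOR} eyes"), re.compile(rf"{COLOR} hair")]
PRUNE_EXACT = {
    "two-tone hair", "gradient hair", "extra ears",
    "bangs", "sidelocks", "eyebrows visible through hair", "hair between eyes",
}
KEEP_ALWAYS = {"multicolored eyes", "heterochromia", "slit pupils"}
RENAME = {"headband": "black headband"}  # consolidate color variants
SPECIFIC_ORNAMENTS = {"hairclip", "black headband", "hair flower", "hair ribbon"}
MOUTH_TAGS = {"open mouth", "closed mouth", "parted lips"}


def should_prune(tag):
    """True if the tag falls under one of the pruning rules."""
    if tag in KEEP_ALWAYS:
        return False
    return tag in PRUNE_EXACT or any(p.fullmatch(tag) for p in PRUNE_PATTERNS)


def clean_tags(tags):
    """Apply renames and pruning to one image's tags, preserving order."""
    cleaned = []
    for raw in tags:
        tag = RENAME.get(raw.strip(), raw.strip())
        if tag and not should_prune(tag) and tag not in cleaned:
            cleaned.append(tag)
    # Drop the generic ornament tag once a more specific one is present.
    if any(tag in SPECIFIC_ORNAMENTS for tag in cleaned):
        cleaned = [tag for tag in cleaned if tag != "hair ornament"]
    return cleaned


def needs_mouth_tag(tags):
    """True if none of the mouth-state tags is present yet."""
    return MOUTH_TAGS.isdisjoint(tags)


if __name__ == "__main__":
    example = ["1girl", "black hair", "pink eyes", "short hair", "colored inner hair",
               "bangs", "hair ornament", "hairclip", "halo", "headband", "@_@"]
    print(clean_tags(example))
    print(needs_mouth_tag(example))
```

Rules like these only cover the mechanical parts. Visibility checks, expressions, and per-character exceptions (such as keeping `short hair` for Kazusa) still need a human looking at each image.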