|
# Rhythm Heaven Style LoRA for Stable Diffusion 1.5 |
|
Model is also on CivitAI: https://civitai.com/models/87254?modelVersionId=258514 |
|
## Model Details |
|
### Version 1 parameters: |
|
steps_per_image: 50 |
|
total_images: 49 |
|
total_steps: ~2400 |
|
training_model: Anything_V3 |
|
network_dim: 128 |
|
network_alpha: 128 |
|
network_train_on: both |
|
learning_rate: 1e-4 |
|
unet_lr: 0 |
|
text_encoder_lr: 5e-5 |
|
lr_scheduler: constant |
|
lr_scheduler_num_cycles: 1 |
|
lr_scheduler_power: 1 |
|
train_batch_size: 6 |
|
num_epochs: 6 |
|
mixed_precision: fp16 |
|
save_precision: fp16 |
|
save_n_epochs_type: save_every_n_epochs |
|
save_n_epochs_type_value: 1 |
|
resolution: 512 |
|
max_token_length: 225 |
|
clip_skip: 2 |
|
additional_argument: --shuffle_caption --xformers |
|
training_hardware: Google Colab Free Tier: Nvidia Tesla T4 GPU |
|
training_time: ~45 minutes |
|
|
|
### Version 1.1 parameters: |
|
steps_per_image: 20 |
|
total_images: 122 (61 unique images, doubled amount by mirroring them) |
|
total_steps: 2440 |
|
training_model: Any_LoRA |
|
optimizer: AdamW |
|
network_dim: 128 |
|
network_alpha: 128 |
|
network_train_on: both |
|
learning_rate: 1e-4 |
|
unet_lr: 1e-4 |
|
text_encoder_lr: 5e-5 |
|
lr_scheduler: constant |
|
lr_scheduler_num_cycles: 1 |
|
lr_scheduler_power: 1 |
|
train_batch_size: 8 |
|
num_epochs: 6 |
|
mixed_precision: bf16 |
|
save_precision: bf16 |
|
save_n_epochs_type: save_every_n_epochs |
|
save_n_epochs_type_value: 1 |
|
resolution: 768 |
|
max_token_length: 225 |
|
clip_skip: 2 |
|
additional_argument: --xformers |
|
training_hardware: RTX 3090 |
|
training_time: ~1.5 hours (I don't remember exactly) |
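As a quick sanity check on the numbers above, the listed `total_steps` values line up with `steps_per_image * total_images`. The sketch below only reproduces that arithmetic from the figures in this card; the function name is made up for illustration, and it does not claim to mirror how the trainer itself counts optimizer steps (which may also depend on batch size and epochs).

```python
def listed_total_steps(steps_per_image: int, total_images: int) -> int:
    """Hypothetical helper: multiply per-image steps by image count,
    which is how the figures in this card appear to be related."""
    return steps_per_image * total_images


# Version 1: 50 steps per image, 49 images.
v1_steps = listed_total_steps(50, 49)

# Version 1.1: 61 unique images, doubled by mirroring, 20 steps per image.
v1_1_images = 61 * 2
v1_1_steps = listed_total_steps(20, v1_1_images)

print(f"Version 1:   {v1_steps} steps (listed as ~2400)")
print(f"Version 1.1: {v1_1_images} images, {v1_1_steps} steps (listed as 2440)")
```

Version 1 works out to 2450, which is why it is listed as an approximate value, while Version 1.1 matches the listed 2440 exactly.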
|
#### Version 1.1 Improvements: |
|
- **Better style consistency**: The model generates in a style closer to the Rhythm Heaven series much more consistently. Version 1 produced a slightly more detailed style, though, so use that version if that is what you want. |

- **Removed the "rhythm_heaven" trigger**: A style trigger doesn't seem to be necessary, and removing it saves a bit of token length. |

- **Fewer unprompted black-and-white generations**: This is a smaller change, but I manually added color to some of the training images for more variety, which means you'll get fewer black-and-white generations. |
|
## Model Description |
|
Trained on humanoid characters from the Rhythm Heaven series (and some from WarioWare) using AnyLoRA. |
|
Captions were done manually using booru tags. |
|
- **Model type:** Standard LoRA |
|
- **Finetuned from model:** Stable Diffusion 1.5 based models |
|
## Model Sources |
|
- **Repository:** [More Information Needed] |
|
- **CivitAI Link:** https://civitai.com/models/87254?modelVersionId=258514 |
|
## Uses |
|
Use it together with a booru-based Stable Diffusion 1.5 model (e.g. Any_LoRA) to emulate the style of the Rhythm Heaven series. |
|
I recommend a weight of around 0.7 when prompting. As a reminder, this model was trained exclusively with booru tags, so I'm not sure how well it will work with BLIP captions. |
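To make the weight recommendation concrete, here is a minimal sketch that builds a booru-tag prompt with the LoRA attached at weight 0.7. It assumes a front end that accepts the `<lora:name:weight>` prompt syntax (as the AUTOMATIC1111 web UI does); the helper name, the example tags, and the file name `rhythm_heaven_style` are placeholders, not the actual file name of this model.

```python
def build_prompt(tags, lora_name, weight=0.7):
    """Hypothetical helper: join booru tags with commas and append a
    <lora:name:weight> tag in the syntax used by the AUTOMATIC1111 web UI."""
    # Drop empty entries and surrounding whitespace from the tag list.
    parts = [tag.strip() for tag in tags if tag.strip()]
    # ":g" formats 0.7 as "0.7" and 1.0 as "1" (no trailing zeros).
    parts.append(f"<lora:{lora_name}:{weight:g}>")
    return ", ".join(parts)


# "rhythm_heaven_style" is a placeholder; use the name of your downloaded file.
prompt = build_prompt(["1girl", "solo", "smile"], "rhythm_heaven_style")
print(prompt)
```

Since Version 1.1 dropped the "rhythm_heaven" trigger, the prompt only needs ordinary booru tags plus the LoRA tag.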