# Rhythm Heaven Style LoRA for Stable Diffusion 1.5
The model is also available on CivitAI: https://civitai.com/models/87254?modelVersionId=258514
## Model Details
### Version 1 parameters:
    steps_per_image: 50
    total_images: 49
    total_steps: ~2400
    training_model: Anything_V3
    network_dim: 128
    network_alpha: 128
    network_train_on: both
    learning_rate: 1e-4
    unet_lr: 0
    text_encoder_lr: 5e-5
    lr_scheduler: constant
    lr_scheduler_num_cycles: 1
    lr_scheduler_power: 1
    train_batch_size: 6
    num_epochs: 6
    mixed_precision: fp16
    save_precision: fp16
    save_n_epochs_type: save_every_n_epochs
    save_n_epochs_type_value: 1
    resolution: 512
    max_token_length: 225
    clip_skip: 2
    additional_argument: --shuffle_caption --xformers
    training_hardware: Google Colab Free Tier: Nvidia Tesla T4 GPU
    training_time: ~45 minutes

### Version 1.1 parameters:
    steps_per_image: 20
    total_images: 122 (61 unique images, doubled by mirroring each one)
    total_steps: 2440
    training_model: Any_LoRA
    optimizer: AdamW
    network_dim: 128
    network_alpha: 128
    network_train_on: both
    learning_rate: 1e-4
    unet_lr: 1e-4
    text_encoder_lr: 5e-5
    lr_scheduler: constant
    lr_scheduler_num_cycles: 1
    lr_scheduler_power: 1
    train_batch_size: 8
    num_epochs: 6
    mixed_precision: bf16
    save_precision: bf16
    save_n_epochs_type: save_every_n_epochs
    save_n_epochs_type_value: 1
    resolution: 768
    max_token_length: 225
    clip_skip: 2
    additional_argument: --xformers
    training_hardware: RTX 3090
    training_time: ~1.5 hours (I don't remember exactly)
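
As a rough consistency check on the figures listed for both versions, the step totals line up with `steps_per_image` multiplied by `total_images`. The sketch below assumes that relationship, which is inferred from the listed numbers rather than confirmed by the training setup, and the helper names `total_steps` and `mirrored_count` are made up for illustration.

```python
# Sanity-check the step counts listed in the parameter blocks.
# Assumption: total_steps = steps_per_image * total_images
# (inferred from the listed figures, not confirmed by the trainer).


def total_steps(steps_per_image: int, total_images: int) -> int:
    """Return the step total implied by the assumed formula."""
    return steps_per_image * total_images


def mirrored_count(unique_images: int) -> int:
    """Each unique image plus one horizontally mirrored copy."""
    return unique_images * 2


# Version 1: 49 images at 50 steps each (the card lists ~2400).
v1_steps = total_steps(steps_per_image=50, total_images=49)

# Version 1.1: 61 unique images, doubled by mirroring, at 20 steps each.
v1_1_images = mirrored_count(61)
v1_1_steps = total_steps(steps_per_image=20, total_images=v1_1_images)

print(f"Version 1:   {v1_steps} steps")
print(f"Version 1.1: {v1_1_steps} steps over {v1_1_images} images")
```

Under this assumption both versions land at roughly the same step total, even though Version 1.1 uses far fewer steps per image on a larger dataset.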
#### Version 1.1 Improvements:
- **Better style consistency**: The model reproduces the Rhythm Heaven style much more consistently. Version 1 generated a slightly more detailed style, so use that one if that's what you want.
- **Removed "rhythm_heaven" trigger**: A style trigger doesn't seem to be necessary, and removing it saves a bit of token length.
- **Fewer unprompted black and white generations**: This is a smaller change. I manually added color to some of the training images for more variety, so you'll get fewer black and white generations.
## Model Description
Trained on humanoid characters from the Rhythm Heaven series (and some from WarioWare) using AnyLoRA.
Captions were done manually using booru tags.
- **Model type:** Standard LoRA 
- **Finetuned from model:** Stable Diffusion 1.5 based models
## Model Sources
- **Repository:** [More Information Needed]
- **CivitAI Link:** https://civitai.com/models/87254?modelVersionId=258514
## Uses
Use it with a booru-based Stable Diffusion 1.5 model (e.g. Any_LoRA) to emulate the style of the Rhythm Heaven series.
I recommend a weight of around 0.7 when prompting. As a reminder, this model was trained exclusively with booru tags, so I'm not sure how
well it will work with BLIP captions.
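
To make the weight recommendation concrete, here is a minimal sketch of a helper that joins booru tags and appends a LoRA tag at weight 0.7. It assumes a frontend that uses the `<lora:name:weight>` prompt syntax (as in the AUTOMATIC1111 web UI), which this card does not specify. The file name `rhythm_heaven_style` and the example tags are placeholders, not the actual file name or recommended tags.

```python
# Build a booru-tag prompt with a LoRA tag appended.
# Assumptions: the frontend understands "<lora:name:weight>" tags, and
# "rhythm_heaven_style" stands in for whatever the downloaded file is called.


def build_prompt(tags, lora_name, weight=0.7):
    """Join booru tags with commas and append the LoRA tag."""
    cleaned = [tag.strip() for tag in tags if tag.strip()]
    lora_tag = f"<lora:{lora_name}:{weight:g}>"
    return ", ".join(cleaned + [lora_tag])


# Placeholder tags for illustration only.
prompt = build_prompt(["1girl", "solo", "smile"], "rhythm_heaven_style")
print(prompt)
```

Since Version 1.1 no longer needs the "rhythm_heaven" trigger, the tag list only has to describe the subject. For Version 1, the trigger can be passed as an ordinary tag.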