license: gpl-3.0

Experiments in training 0.4B-parameter RWKV models on MIDI notation, in a manner similar to an already existing MIDI model.

  • RWKV v4neo based, 20 epochs: loss of about 2.7


  • WIP v6 pretrain that also performs poorly. Loss hovered between 2.3 and 2.5, and I'm guessing it ended up at about 2.5. Kind of sad, but it can probably be used as a base model.
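
To put the loss numbers above in perspective, here is a rough sketch that converts them to perplexity. It assumes the reported loss is the mean per-token cross-entropy in nats, which this README does not actually state, so treat the results as a ballpark only. The `perplexity` helper is a made-up name for illustration.

```python
import math


def perplexity(loss_nats: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss_nats)


# Loss values quoted above: ~2.7 for v4neo, roughly 2.3 to 2.5 for the v6 pretrain.
for loss in (2.3, 2.5, 2.7):
    print(f"loss {loss:.1f} -> perplexity {perplexity(loss):.1f}")
```

Under that assumption, going from 2.7 to 2.5 means the model is choosing among roughly 12 equally likely tokens per step instead of roughly 15.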

April 12, 2024 Update

  • Added v6 with different layer sizes.
  • Trained a base model on all of Breadmidi, filtered to piano only and augmented 10 times. See the following wandb for training logs (note: these include experimentation; the final-ish runs are the ones used for the final file).
  • Used the above model as the initial model, then trained on a combined dataset of Breadmidi + Los Angeles + Monster, filtered to piano only and augmented 3x (I wish I had the storage space to do more). See the following wandb.
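
For anyone wanting to reproduce the "filtered by piano, augmented N times" step, here is a minimal sketch of one way it could be done. This is not the actual preprocessing used here: the README does not say which augmentation was applied, so pitch transposition is purely an assumption, and `Note`, `filter_piano`, `transpose`, and `augment` are hypothetical names. Piano is taken to mean the General MIDI piano family (programs 0 to 7, zero-indexed).

```python
from typing import List, NamedTuple, Optional


class Note(NamedTuple):
    program: int     # General MIDI program number, zero-indexed
    pitch: int       # MIDI note number, 0-127
    start: float     # onset time in seconds
    duration: float  # length in seconds


# General MIDI "Piano" family, zero-indexed (acoustic grand through clavinet).
PIANO_PROGRAMS = range(0, 8)


def filter_piano(notes: List[Note]) -> List[Note]:
    """Keep only notes played by a piano-family instrument."""
    return [n for n in notes if n.program in PIANO_PROGRAMS]


def transpose(notes: List[Note], semitones: int) -> Optional[List[Note]]:
    """Shift every pitch; return None if any note would leave the 0-127 range."""
    shifted = []
    for n in notes:
        pitch = n.pitch + semitones
        if not 0 <= pitch <= 127:
            return None
        shifted.append(n._replace(pitch=pitch))
    return shifted


def augment(notes: List[Note], copies: int) -> List[List[Note]]:
    """Return up to `copies` versions: the original plus nearby transpositions.

    Offsets are tried in the order 0, +1, -1, +2, -2, ... (assumed scheme).
    Versions that would fall outside the MIDI pitch range are skipped.
    """
    offsets = [0]
    step = 1
    while len(offsets) < copies:
        offsets.extend([step, -step])
        step += 1
    versions = []
    for offset in offsets[:copies]:
        shifted = transpose(notes, offset)
        if shifted is not None:
            versions.append(shifted)
    return versions
```

With `copies=10` this yields the original plus transpositions from -4 to +5 semitones, and with `copies=3` the original plus one semitone up and down. Storage grows linearly with `copies`, which is why the larger combined dataset only got 3x.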