---
license: gpl-3.0
---
Experiments on training 0.4B RWKV models on MIDI notation, in a manner similar to [this already existing](https://huggingface.co/brianflakes/rwkv-midi-piano) MIDI model.
* RWKV v4neo based, 20 epochs: loss of about 2.7
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6352287eef8786433ecdb736/zPg9n76e40lEl-HzF7TvF.png)
* WIP v6 pretrain that also sucks. Loss was around 2.3 to 2.5, and I'm guessing it ended up at 2.5. Kind of sad, but it can probably be used as a base.
## April 12, 2024 Update
* Added v6 with different layer sizes.
* Trained a base model on all of Breadmidi, filtered to piano instruments only and augmented 10 times. See the following [wandb](https://wandb.ai/smashmaster0045/Generic%20RWKV-6%20Piano%20Midi%20Model%20Base%20L29%20Augumented%20Data%20Test%20Bread%20Only/workspace) for training logs (note that these include experimental runs; the final-ish runs are the ones used for the final file).
* Used the above model as the initial model, then trained on a combined dataset of Breadmidi + Los Angeles + Monster, filtered to piano and augmented 3x (I wish I had the storage space to do more). See the following [wandb](https://wandb.ai/smashmaster0045/Generic%20RWKV-6%20Piano%20Midi%20Model%20Base%20L29%20Augumented%20Data%20Test%20bread%20to%20diverse%20transfer/workspace).
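
For readers wondering what "filtered by piano" and "augmented N times" could look like in practice, here is a minimal sketch. It is not the actual preprocessing used for these models: the `Note` tuple, the helper names, and the choice of pitch transposition as the augmentation are all assumptions made for illustration. The only fixed fact it relies on is that General MIDI program numbers 0 to 7 (zero-indexed) are the piano family.

```python
from typing import Iterable, List, NamedTuple


class Note(NamedTuple):
    """Hypothetical note record, not the format used by the real pipeline."""
    start: float     # onset time in seconds
    duration: float  # length in seconds
    pitch: int       # MIDI note number, 0-127
    velocity: int    # MIDI velocity, 1-127
    program: int     # General MIDI program number, 0-127 (zero-indexed)


# General MIDI programs 0-7 (zero-indexed) make up the piano family.
PIANO_PROGRAMS = range(0, 8)


def piano_only(notes: Iterable[Note]) -> List[Note]:
    """Keep only notes played by a piano-family program."""
    return [n for n in notes if n.program in PIANO_PROGRAMS]


def transpose(notes: Iterable[Note], semitones: int) -> List[Note]:
    """Shift every pitch by `semitones`, dropping notes that leave 0-127."""
    shifted = []
    for n in notes:
        new_pitch = n.pitch + semitones
        if 0 <= new_pitch <= 127:
            shifted.append(n._replace(pitch=new_pitch))
    return shifted


def augment(notes: Iterable[Note], shifts: Iterable[int]) -> List[List[Note]]:
    """Return one piano-only, transposed copy of the piece per shift.

    Assumption: augmentation means pitch transposition. The source does
    not say which augmentation was actually applied.
    """
    piano = piano_only(notes)
    return [transpose(piano, s) for s in shifts]
```

Under these assumptions, `augment(notes, range(-5, 5))` would give 10 variants per file and `augment(notes, (-1, 0, 1))` would give 3, which also shows why storage grows linearly with the augmentation factor.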