---
base_model: gpt2
tags:
- generated_from_trainer
model-index:
- name: midi_model_3
  results: []
---
# midi_model_3
This model is a fine-tuned version of gpt2 on the js-fakes-4bars dataset. It achieves the following results on the evaluation set:
- Loss: 0.5542
## Model description
This model generates encoded MIDI that follows the format of the JS-Fakes chorales. This text-based representation makes it possible to train traditional language models on MIDI data. Also see Magenta here.
## Intended uses & limitations
This model generates basic encoded MIDI in the JS-Fakes style as a proof of concept. It is very limited, and mainly demonstrates that this kind of model can be trained and hosted completely free of charge.
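As a rough usage sketch (not from this repository's documentation), the model can be loaded and sampled with the standard `transformers` API. The repository id below is a placeholder, and the `PIECE_START` prompt follows the encoding described under "Training and evaluation data":

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/midi_model_3"  # placeholder: substitute the actual Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Seed the model with the start-of-piece token and sample a continuation.
inputs = tokenizer("PIECE_START", return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_length=256,
    do_sample=True,
    temperature=1.0,
)
print(tokenizer.decode(output_ids[0]))
```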
## Training and evaluation data
This model was trained on the js-fakes-4bars dataset, which is a tokenized version of the JS-Fakes dataset by Omar Peracha (a loading sketch follows this list).
- Link to the original dataset here
- Link to the tokenized dataset here
- Training set is 4.02k rows
- Test set is 463 rows
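For reference, the tokenized data can be pulled with the `datasets` library; the repository id below is a placeholder for the tokenized dataset linked above:

```python
from datasets import load_dataset

# Placeholder id: substitute the tokenized js-fakes-4bars repository linked above.
dataset = load_dataset("username/js-fakes-4bars")

print(dataset)               # expected splits: train (~4.02k rows) and test (463 rows)
print(dataset["train"][0])   # one row of MIDI encoded as text
```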
The data encodes MIDI information as plain text. Here are the tokens that appear in the data, with their meanings (a small decoding sketch follows this list):
- PIECE_START (The start of the MIDI piece.)
- PIECE_END (The end of the MIDI piece.)
- STYLE=JSFAKES (A style tag, which is unused in this dataset.)
- GENRE=JSFAKES (A genre tag, also unused in this dataset.)
- TRACK_START (The start of an instrument's track.)
- TRACK_END (The end of an instrument's track.)
- INST=48 (The instrument the notes will belong to.)
- BAR_START (The start of a musical measure.)
- BAR_END (The end of a musical measure.)
- NOTE_ON=57 (Specifies the note that will start.)
- NOTE_OFF=57 (Specifies the note that will end.)
- TIME_DELTA=4 (How long the note plays for.)
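To make the encoding concrete, here is a small illustrative decoder (not part of this repository) that walks a token string like the one above and recovers note events; the exact time unit behind TIME_DELTA is an assumption:

```python
def decode_tokens(token_string):
    """Turn an encoded-text sequence into (instrument, pitch, start, duration) tuples."""
    notes = []       # finished notes
    active = {}      # pitch -> start time of notes currently sounding
    instrument = None
    time = 0

    for token in token_string.split():
        if token.startswith("INST="):
            instrument = int(token.split("=", 1)[1])
        elif token == "TRACK_START":
            time = 0                                 # each track's timeline starts at zero
        elif token.startswith("TIME_DELTA="):
            time += int(token.split("=", 1)[1])      # advance time by the given number of steps
        elif token.startswith("NOTE_ON="):
            active[int(token.split("=", 1)[1])] = time
        elif token.startswith("NOTE_OFF="):
            pitch = int(token.split("=", 1)[1])
            start = active.pop(pitch, time)
            notes.append((instrument, pitch, start, time - start))
    return notes


example = ("PIECE_START TRACK_START INST=48 BAR_START "
           "NOTE_ON=57 TIME_DELTA=4 NOTE_OFF=57 BAR_END TRACK_END PIECE_END")
print(decode_tokens(example))  # [(48, 57, 0, 4)]
```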
## Training procedure
Training was done on Google Colab's free tier, using a single 15 GB Tesla T4 GPU, and was logged with Weights & Biases. The full training notebook can be found [here](https://colab.research.google.com/drive/1uvv-ChthIrmEJMBOVyL7mTm4dcf4QZq7#scrollTo=34kpyWSnaJE1).
### Training hyperparameters
The following hyperparameters were used during training (a TrainingArguments sketch follows this list):
- learning_rate: 0.0005
- train_batch_size: 4
- eval_batch_size: 2
- seed: 1
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 10
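As a hedged sketch, these settings map onto a `transformers` `TrainingArguments` configuration roughly as follows; the output directory and evaluation cadence are assumptions (the 300-step interval matches the results table below), and the listed Adam betas/epsilon are the library defaults:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="midi_model_3",       # assumption
    learning_rate=5e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=2,
    seed=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=10,
    evaluation_strategy="steps",     # assumption: evaluate every 300 steps, as in the table below
    eval_steps=300,
    report_to="wandb",               # training was logged to Weights & Biases
)
```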
### Training Statistics
- Total training runtime: 787 seconds (around 13 minutes)
- Training samples per second: 45.91
- Training steps per second: 11.484
- Average GPU power draw: 66 W
- Average GPU temperature: 77 °C
### Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
0.8047 | 0.33 | 300 | 0.7969 |
0.7924 | 0.66 | 600 | 0.7735 |
0.7758 | 1.0 | 900 | 0.7528 |
0.75 | 1.33 | 1200 | 0.7436 |
0.7432 | 1.66 | 1500 | 0.7277 |
0.7361 | 1.99 | 1800 | 0.7175 |
0.7121 | 2.32 | 2100 | 0.7025 |
0.708 | 2.65 | 2400 | 0.6861 |
0.6971 | 2.99 | 2700 | 0.6781 |
0.6777 | 3.32 | 3000 | 0.6718 |
0.6733 | 3.65 | 3300 | 0.6578 |
0.6643 | 3.98 | 3600 | 0.6500 |
0.6422 | 4.31 | 3900 | 0.6423 |
0.6401 | 4.65 | 4200 | 0.6330 |
0.6302 | 4.98 | 4500 | 0.6228 |
0.6103 | 5.31 | 4800 | 0.6148 |
0.6066 | 5.64 | 5100 | 0.6069 |
0.5995 | 5.97 | 5400 | 0.5979 |
0.5724 | 6.31 | 5700 | 0.5915 |
0.5772 | 6.64 | 6000 | 0.5870 |
0.5677 | 6.97 | 6300 | 0.5771 |
0.5491 | 7.3 | 6600 | 0.5740 |
0.5433 | 7.63 | 6900 | 0.5675 |
0.5384 | 7.96 | 7200 | 0.5630 |
0.5245 | 8.3 | 7500 | 0.5611 |
0.5206 | 8.63 | 7800 | 0.5578 |
0.5198 | 8.96 | 8100 | 0.5553 |
0.5141 | 9.29 | 8400 | 0.5544 |
0.5091 | 9.62 | 8700 | 0.5543 |
0.5096 | 9.96 | 9000 | 0.5542 |
### Framework versions
- Transformers 4.35.2
- Pytorch 2.1.0+cu118
- Datasets 2.15.0
- Tokenizers 0.15.0