---
base_model: gpt2
tags:
- generated_from_trainer
model-index:
- name: midi_model_3
  results: []
---
# midi_model_3
This model is a fine-tuned version of gpt2 on the js-fakes-4bars dataset. It achieves the following results on the evaluation set:
- Loss: 0.5542
## Model description
This model generates encoded MIDI that follows the format of the JS-Fakes chorales. This text-based representation makes it possible to train traditional language models on MIDI data. Also see Magenta here.
## Intended uses & limitations
This model generates basic encoded MIDI in the JS-Fakes style as a proof of concept. It is very limited, and mainly demonstrates that this kind of model can be trained and hosted completely free of charge.
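As a rough usage sketch (not from this repository's documentation), the model can be loaded and sampled with the standard `transformers` API. The repository id below is a placeholder, and the `PIECE_START` prompt follows the encoding described under "Training and evaluation data":

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/midi_model_3"  # placeholder: substitute the actual Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Seed the model with the start-of-piece token and sample a continuation.
inputs = tokenizer("PIECE_START", return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_length=256,
    do_sample=True,
    temperature=1.0,
)
print(tokenizer.decode(output_ids[0]))
```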
## Training and evaluation data
This model was trained on the js-fakes-4bars dataset, which is a tokenized version of the JS-Fakes dataset by Omar Peracha (a loading sketch follows this list).
- Link to the original dataset here
- Link to the tokenized dataset here
- Training set is 4.02k rows
- Test set is 463 rows
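For reference, the tokenized data can be pulled with the `datasets` library; the repository id below is a placeholder for the tokenized dataset linked above:

```python
from datasets import load_dataset

# Placeholder id: substitute the tokenized js-fakes-4bars repository linked above.
dataset = load_dataset("username/js-fakes-4bars")

print(dataset)               # expected splits: train (~4.02k rows) and test (463 rows)
print(dataset["train"][0])   # one row of MIDI encoded as text
```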
The data encodes MIDI information as plain text. Here are the tokens that appear in the data, with their meanings (a small decoding sketch follows this list):
- PIECE_START (The start of the MIDI piece.)
- PIECE_END (The end of the MIDI piece.)
- STYLE=JSFAKES (A style tag, which is unused in this dataset.)
- GENRE=JSFAKES (A genre tag, also unused in this dataset.)
- TRACK_START (The start of an instrument's track.)
- TRACK_END (The end of an instrument's track.)
- INST=48 (The instrument the notes will belong to.)
- BAR_START (The start of a musical measure.)
- BAR_END (The end of a musical measure.)
- NOTE_ON=57 (Specifies the note that will start.)
- NOTE_OFF=57 (Specifies the note that will end.)
- TIME_DELTA=4 (How long the note plays for.)
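To make the encoding concrete, here is a small illustrative decoder (not part of this repository) that walks a token string like the one above and recovers note events; the exact time unit behind TIME_DELTA is an assumption:

```python
def decode_tokens(token_string):
    """Turn an encoded-text sequence into (instrument, pitch, start, duration) tuples."""
    notes = []       # finished notes
    active = {}      # pitch -> start time of notes currently sounding
    instrument = None
    time = 0

    for token in token_string.split():
        if token.startswith("INST="):
            instrument = int(token.split("=", 1)[1])
        elif token == "TRACK_START":
            time = 0                                 # each track's timeline starts at zero
        elif token.startswith("TIME_DELTA="):
            time += int(token.split("=", 1)[1])      # advance time by the given number of steps
        elif token.startswith("NOTE_ON="):
            active[int(token.split("=", 1)[1])] = time
        elif token.startswith("NOTE_OFF="):
            pitch = int(token.split("=", 1)[1])
            start = active.pop(pitch, time)
            notes.append((instrument, pitch, start, time - start))
    return notes


example = ("PIECE_START TRACK_START INST=48 BAR_START "
           "NOTE_ON=57 TIME_DELTA=4 NOTE_OFF=57 BAR_END TRACK_END PIECE_END")
print(decode_tokens(example))  # [(48, 57, 0, 4)]
```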
## Training procedure
Training was done on Google Colab's free tier, using a single 15 GB Tesla T4 GPU, and was logged with Weights & Biases. The full training notebook can be found [here](https://colab.research.google.com/drive/1uvv-ChthIrmEJMBOVyL7mTm4dcf4QZq7#scrollTo=34kpyWSnaJE1).
### Training hyperparameters
The following hyperparameters were used during training (a TrainingArguments sketch follows this list):
- learning_rate: 0.0005
- train_batch_size: 4
- eval_batch_size: 2
- seed: 1
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 10
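As a hedged sketch, these settings map onto a `transformers` `TrainingArguments` configuration roughly as follows; the output directory and evaluation cadence are assumptions (the 300-step interval matches the results table below), and the listed Adam betas/epsilon are the library defaults:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="midi_model_3",       # assumption
    learning_rate=5e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=2,
    seed=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=10,
    evaluation_strategy="steps",     # assumption: evaluate every 300 steps, as in the table below
    eval_steps=300,
    report_to="wandb",               # training was logged to Weights & Biases
)
```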
### Training Statistics
- Total training runtime: 787 seconds (around 13 minutes)
- Training samples per second: 45.91
- Training steps per second: 11.484
- Average GPU power draw: 66 W
- Average GPU temperature: 77 °C
### Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
0.8047 | 0.33 | 300 | 0.7969 |
0.7924 | 0.66 | 600 | 0.7735 |
0.7758 | 1.0 | 900 | 0.7528 |
0.75 | 1.33 | 1200 | 0.7436 |
0.7432 | 1.66 | 1500 | 0.7277 |
0.7361 | 1.99 | 1800 | 0.7175 |
0.7121 | 2.32 | 2100 | 0.7025 |
0.708 | 2.65 | 2400 | 0.6861 |
0.6971 | 2.99 | 2700 | 0.6781 |
0.6777 | 3.32 | 3000 | 0.6718 |
0.6733 | 3.65 | 3300 | 0.6578 |
0.6643 | 3.98 | 3600 | 0.6500 |
0.6422 | 4.31 | 3900 | 0.6423 |
0.6401 | 4.65 | 4200 | 0.6330 |
0.6302 | 4.98 | 4500 | 0.6228 |
0.6103 | 5.31 | 4800 | 0.6148 |
0.6066 | 5.64 | 5100 | 0.6069 |
0.5995 | 5.97 | 5400 | 0.5979 |
0.5724 | 6.31 | 5700 | 0.5915 |
0.5772 | 6.64 | 6000 | 0.5870 |
0.5677 | 6.97 | 6300 | 0.5771 |
0.5491 | 7.3 | 6600 | 0.5740 |
0.5433 | 7.63 | 6900 | 0.5675 |
0.5384 | 7.96 | 7200 | 0.5630 |
0.5245 | 8.3 | 7500 | 0.5611 |
0.5206 | 8.63 | 7800 | 0.5578 |
0.5198 | 8.96 | 8100 | 0.5553 |
0.5141 | 9.29 | 8400 | 0.5544 |
0.5091 | 9.62 | 8700 | 0.5543 |
0.5096 | 9.96 | 9000 | 0.5542 |
### Framework versions
- Transformers 4.35.2
- Pytorch 2.1.0+cu118
- Datasets 2.15.0
- Tokenizers 0.15.0