---
tags:
- generated_from_keras_callback
- music
datasets:
- juancopi81/mutopia_guitar_dataset
widget:
- text: PIECE_START TIME_SIGNATURE=4_4 BPM=90 TRACK_START INST=0 DENSITY=2 BAR_START
NOTE_ON=43
example_title: Time signature 4/4, BPM=90, NOTE=G2
base_model: gpt2
model-index:
- name: juancopi81/mutopia_guitar_mmm
results: []
---
# juancopi81/mutopia_guitar_mmm
Music generation can be approached similarly to language generation: there are many ways to represent music as text and then train a language model to generate it. For encoding MIDI files as text, I am using the excellent [implementation](https://github.com/AI-Guru/MMM-JSB) by Dr. Tristan Behrens of the paper [MMM: Exploring Conditional Multi-Track Music Generation with the Transformer](https://arxiv.org/abs/2008.06048).
This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on the [Mutopia Guitar Dataset](https://huggingface.co/datasets/juancopi81/mutopia_guitar_dataset). Use the widget to generate your piece, and then use [this notebook](https://colab.research.google.com/drive/14vlJwCvDmNH6SFfVuYY0Y18qTbaHEJCY?usp=sharing) to listen to the results (work in progress).
I created the notebook as an adaptation of [the one created by Dr. Tristan Behrens](https://huggingface.co/TristanBehrens/js-fakes-4bars).
It achieves the following results on the evaluation set:
- Train Loss: 0.5365
- Validation Loss: 1.5482
## Model description
The model is GPT-2 loaded with the GPT2LMHeadModel architecture from Hugging Face. The context size is 256, and the vocabulary size is 588. The tokenizer uses a
`WhitespaceSplit` pre-tokenizer; the [tokenizer](https://huggingface.co/juancopi81/mutopia_guitar_dataset_tokenizer) is also available on the Hugging Face Hub.
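A minimal sketch of loading the model from the Hub and sampling from the widget prompt (this assumes the repository ships TensorFlow weights, since the model was trained with Keras; the generation parameters below are illustrative, not the ones used for the demo):

```python
from transformers import AutoTokenizer, TFGPT2LMHeadModel

model_id = "juancopi81/mutopia_guitar_mmm"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFGPT2LMHeadModel.from_pretrained(model_id)

# Same prompt as the widget: 4/4, 90 BPM, one guitar track starting on G2 (MIDI 43).
prompt = (
    "PIECE_START TIME_SIGNATURE=4_4 BPM=90 TRACK_START "
    "INST=0 DENSITY=2 BAR_START NOTE_ON=43"
)
inputs = tokenizer(prompt, return_tensors="tf")

# Sample up to the 256-token context size used during training.
output = model.generate(
    inputs["input_ids"],
    max_length=256,
    do_sample=True,
    temperature=0.9,
    top_k=50,
)
print(tokenizer.decode(output[0]))
```

The decoded token sequence can then be turned back into MIDI with the Colab notebook linked above.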
## Intended uses & limitations
I built this model to learn more about how to use Hugging Face. I am implementing some of the parts of the [Hugging Face course](https://huggingface.co/course/chapter1/1) with a project that I find interesting.
The main intention of this model is educational. I am creating a [series of notebooks](https://github.com/juancopi81/MMM_Mutopia_Guitar) where I show every step of the process:
- Collecting the data
- Pre-processing the data
- Training a tokenizer from scratch
- Fine-tuning a GPT-2 model
- Building a Gradio app for the model
I trained the model using the free version of Colab with a small dataset, and right now it is heavily overfitting. My plan is to gather a more extensive dataset of guitar music from Latin America and, with more GPU resources, train a new model similar to the Mutopia Guitar Model.
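As an illustration of the tokenizer step listed above, here is a minimal sketch with the Hugging Face Tokenizers library: a word-level vocabulary over the event tokens, split on whitespace as described in the model description (the special tokens and the tiny corpus are placeholders, not the exact ones used in the notebooks):

```python
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# Word-level model: every event token (e.g. NOTE_ON=43, TIME_DELTA=4) is one vocabulary entry.
tokenizer = Tokenizer(models.WordLevel(unk_token="[UNK]"))
tokenizer.pre_tokenizer = pre_tokenizers.WhitespaceSplit()

trainer = trainers.WordLevelTrainer(special_tokens=["[UNK]", "[PAD]"])

# Placeholder corpus: in practice, iterate over the encoded pieces of the Mutopia Guitar Dataset.
corpus = [
    "PIECE_START TIME_SIGNATURE=4_4 BPM=90 TRACK_START INST=0 DENSITY=2 "
    "BAR_START NOTE_ON=43 TIME_DELTA=4 NOTE_OFF=43 BAR_END TRACK_END PIECE_END"
]
tokenizer.train_from_iterator(corpus, trainer=trainer)
print(tokenizer.get_vocab_size())
```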
## Training and evaluation data
I am training the model with the [Mutopia Guitar Dataset](https://huggingface.co/datasets/juancopi81/mutopia_guitar_dataset). This dataset consists of the solo guitar pieces of the [Mutopia Project](https://www.mutopiaproject.org/).
The dataset mainly contains guitar music from Western classical composers, such as Sor, Aguado, Carcassi, and Giuliani.
For the first epochs of training, I augmented the data by transposing each piece up and down through the twelve semitones of an octave. For the later rounds, I trained the model on the untransposed pieces so that the generated music better resembles a real guitar piece.
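A rough sketch of that augmentation on the token representation (the `NOTE_ON=`/`NOTE_OFF=` event names come from the MMM encoding; the exact shifts and helper used in the notebooks are not reproduced here):

```python
import re

def transpose(piece: str, semitones: int) -> str:
    """Shift every NOTE_ON/NOTE_OFF pitch in an encoded piece by `semitones`."""
    def shift(match: re.Match) -> str:
        event, pitch = match.group(1), int(match.group(2))
        return f"{event}={pitch + semitones}"

    return re.sub(r"(NOTE_ON|NOTE_OFF)=(\d+)", shift, piece)

piece = (
    "PIECE_START TIME_SIGNATURE=4_4 BPM=90 TRACK_START INST=0 DENSITY=2 "
    "BAR_START NOTE_ON=43 TIME_DELTA=4 NOTE_OFF=43 BAR_END TRACK_END"
)

# Twelve non-zero shifts spanning an octave around the original piece
# (one possible choice of shifts, for illustration).
augmented = [transpose(piece, s) for s in range(-6, 7) if s != 0]
```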
### Training hyperparameters
<details>
<summary>Click to expand</summary>
The following hyperparameters were used during training (with transposition):
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 5e-07, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 5e-07, 'decay_steps': 5726, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'passive_serialization': True}, 'warmup_steps': 1000, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
The following hyperparameters were used during training (without transposition - first round):
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 5e-07, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 5e-07, 'decay_steps': 350, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, '__passive_serialization__': True}, 'warmup_steps': 1000, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
The following hyperparameters were used during training (without transposition - second round):
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 5e-07, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 5e-07, 'decay_steps': 350, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, '__passive_serialization__': True}, 'warmup_steps': 1000, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
The following hyperparameters were used during training (without transposition, new tokenizer - third round):
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 5e-07, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 5e-07, 'decay_steps': 350, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'passive_serialization': True}, 'warmup_steps': 1000, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
The following hyperparameters were used during training (without transposition, new tokenizer - fourth round):
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 5e-07, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 5e-07, 'decay_steps': 350, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'passive_serialization': True}, 'warmup_steps': 1000, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
The following hyperparameters were used during training (without transposition, new tokenizer - fifth round):
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 5e-07, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 5e-07, 'decay_steps': 350, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, '__passive_serialization__': True}, 'warmup_steps': 1000, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
The following hyperparameters were used during training (without transposition, new tokenizer - sixth round):
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 5e-07, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 5e-07, 'decay_steps': 350, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'passive_serialization': True}, 'warmup_steps': 1000, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
The following hyperparameters were used during training (without transposition, new tokenizer - seventh round):
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': {'class_name': 'WarmUp', 'config': {'initial_learning_rate': 0.0005, 'decay_schedule_fn': {'class_name': 'PolynomialDecay', 'config': {'initial_learning_rate': 0.0005, 'decay_steps': 1025, 'end_learning_rate': 0.0, 'power': 1.0, 'cycle': False, 'name': None}, 'passive_serialization': True}, 'warmup_steps': 1000, 'power': 1.0, 'name': None}}, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-08, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: mixed_float16
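
The serialized optimizer above is the TensorFlow `AdamWeightDecay` optimizer from Transformers wrapping a `WarmUp` + `PolynomialDecay` learning-rate schedule. A minimal sketch rebuilding the seventh-round configuration (illustrative only; the original notebooks may construct it through a helper such as `create_optimizer`):

```python
import tensorflow as tf
from transformers import AdamWeightDecay, WarmUp

# Mixed precision, matching `training_precision: mixed_float16`.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

# Linear (power=1.0) decay from 5e-4 to 0 over 1025 steps, preceded by a 1000-step warmup.
decay = tf.keras.optimizers.schedules.PolynomialDecay(
    initial_learning_rate=5e-4,
    decay_steps=1025,
    end_learning_rate=0.0,
    power=1.0,
)
schedule = WarmUp(
    initial_learning_rate=5e-4,
    decay_schedule_fn=decay,
    warmup_steps=1000,
)
optimizer = AdamWeightDecay(
    learning_rate=schedule,
    weight_decay_rate=0.01,
    beta_1=0.9,
    beta_2=0.999,
    epsilon=1e-8,
)
```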
</details>
### Training results
<details>
<summary>Click to expand</summary>
Using transposition:
| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 1.0705 | 1.3590 | 0 |
| 0.8889 | 1.3702 | 1 |
| 0.7588 | 1.3974 | 2 |
| 0.7294 | 1.4813 | 3 |
| 0.6263 | 1.5263 | 4 |
| 0.5841 | 1.5263 | 5 |
| 0.5844 | 1.5263 | 6 |
| 0.5837 | 1.5346 | 7 |
| 0.5798 | 1.5411 | 8 |
| 0.5773 | 1.5440 | 9 |
Without transposition (first round):
| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 0.5503 | 1.5436 | 0 |
| 0.5503 | 1.5425 | 1 |
| 0.5476 | 1.5425 | 2 |
| 0.5467 | 1.5425 | 3 |
| 0.5447 | 1.5431 | 4 |
| 0.5418 | 1.5447 | 5 |
| 0.5418 | 1.5451 | 6 |
| 0.5401 | 1.5472 | 7 |
| 0.5386 | 1.5479 | 8 |
| 0.5365 | 1.5482 | 9 |
Without transposition (second round):
| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 0.5368 | 1.5482 | 0 |
| 0.5355 | 1.5480 | 1 |
| 0.5326 | 1.5488 | 2 |
| 0.5363 | 1.5493 | 3 |
| 0.5346 | 1.5488 | 4 |
| 0.5329 | 1.5502 | 5 |
| 0.5329 | 1.5514 | 6 |
| 0.5308 | 1.5514 | 7 |
| 0.5292 | 1.5536 | 8 |
| 0.5272 | 1.5543 | 9 |
Without transposition (third round - new tokenizer):
| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 6.1361 | 6.4569 | 0 |
| 5.6383 | 5.8249 | 1 |
| 4.9125 | 4.8956 | 2 |
| 4.2013 | 4.2778 | 3 |
| 3.8665 | 4.0330 | 4 |
| 3.7106 | 3.8956 | 5 |
| 3.6041 | 3.7995 | 6 |
| 3.5301 | 3.7485 | 7 |
| 3.4973 | 3.7323 | 8 |
| 3.4909 | 3.7323 | 9 |
Without transposition (fourth round - new tokenizer):
| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 3.4879 | 3.7206 | 0 |
| 3.4667 | 3.6874 | 1 |
| 3.4229 | 3.6373 | 2 |
| 3.3680 | 3.5751 | 3 |
| 3.2998 | 3.5026 | 4 |
| 3.2208 | 3.4240 | 5 |
| 3.1385 | 3.3397 | 6 |
| 3.0580 | 3.2587 | 7 |
| 2.9949 | 3.2118 | 8 |
| 2.9646 | 3.1958 | 9 |
Without transposition (fifth round - new tokenizer):
| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 2.9562 | 3.1902 | 0 |
| 2.9457 | 3.1751 | 1 |
| 2.9266 | 3.1512 | 2 |
| 2.9039 | 3.1176 | 3 |
| 2.8705 | 3.0775 | 4 |
| 2.8291 | 3.0295 | 5 |
| 2.7872 | 2.9811 | 6 |
| 2.7394 | 2.9321 | 7 |
| 2.6996 | 2.9023 | 8 |
| 2.6819 | 2.8927 | 9 |
Without transposition (sixth round - new tokenizer):
| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 2.6769 | 2.8894 | 0 |
| 2.6719 | 2.8791 | 1 |
| 2.6612 | 2.8638 | 2 |
| 2.6465 | 2.8439 | 3 |
| 2.6242 | 2.8174 | 4 |
| 2.6006 | 2.7877 | 5 |
| 2.5679 | 2.7554 | 6 |
| 2.5387 | 2.7223 | 7 |
| 2.5115 | 2.7029 | 8 |
| 2.5011 | 2.6970 | 9 |
Without transposition (seventh round - new tokenizer):
| Train Loss | Validation Loss | Epoch |
|:----------:|:---------------:|:-----:|
| 2.2881 | 2.2059 | 0 |
| 1.7702 | 1.8533 | 1 |
| 1.4625 | 1.6948 | 2 |
| 1.2876 | 1.6865 | 3 |
| 1.1926 | 1.6414 | 4 |
| 1.1329 | 1.6360 | 5 |
| 1.1069 | 1.6448 | 6 |
| 1.0408 | 1.6207 | 7 |
| 0.8939 | 1.5837 | 8 |
| 0.7265 | 1.5901 | 9 |
| 0.5902 | 1.6261 | 10 |
| 0.4489 | 1.7007 | 11 |
| 0.3223 | 1.7940 | 12 |
| 0.2158 | 1.9032 | 13 |
| 0.1448 | 1.9892 | 14 |
</details>
### Framework versions
- Transformers 4.22.1
- TensorFlow 2.8.2
- Datasets 2.5.1
- Tokenizers 0.12.1