---
license: apache-2.0
---
# Quad Music Transformer
## SOTA quality fast music transformer with symmetrical quad MIDI notes encoding
![Quad-Music-Transformer-Artwork (7)](https://github.com/asigalov61/Quad-Music-Transformer/assets/56325539/9d69c44f-1b35-44b0-b78d-84e53ec30e16)
***
## Original Version
[![Open In Colab][colab-badge]][colab-notebook1]
[colab-notebook1]: <https://colab.research.google.com/github/asigalov61/Quad-Music-Transformer/blob/main/Quad_Music_Transformer.ipynb>
[colab-badge]: <https://colab.research.google.com/assets/colab-badge.svg>
### Features demonstration
***
## Composer Version
[![Open In Colab][colab-badge]][colab-notebook2]
[colab-notebook2]: <https://colab.research.google.com/github/asigalov61/Quad-Music-Transformer/blob/main/Quad_Music_Transformer_Composer.ipynb>
[colab-badge]: <https://colab.research.google.com/assets/colab-badge.svg>
### MuseNet-style workflow for endless supervised continuation generation
***
## Bulk Generator Version
[![Open In Colab][colab-badge]][colab-notebook3]
[colab-notebook3]: <https://colab.research.google.com/github/asigalov61/Quad-Music-Transformer/blob/main/Quad_Music_Transformer_Bulk_Generator.ipynb>
[colab-badge]: <https://colab.research.google.com/assets/colab-badge.svg>
### Bulk improvs and continuations generation
***
## Technical notes
### SOTA quality was achieved by using the following specific techniques:
### 1) Quality source MIDI dataset (quality over quantity)
### 2) MIDI dataset augmentation by time (x2) and pitches (x3)
### 3) Timings normalization, quantization and compression (128)
### 4) Larger model embedding size (2048) with fewer layers (16) and heads (16)
### 5) Training longer since the MIDI dataset is small (2 full epochs)
### 6) Using MIDI instruments families (16) instead of full MIDI instruments range (128)
### 7) Using symmetrical quad MIDI notes encoding
### 8) 8k sequence length so that the model can learn long-term music structure
### 9) Using fp16 precision so that the model is sufficiently fast with a low memory footprint
### 10) Hex (16) MIDI velocity range to avoid velocity overfitting while preserving velocity details
### 11) Chords sorting by instruments families (L-H) and by pitch (H-L)
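
Notes 3, 6, 10 and 11 above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Note` tuple, the function names, and the integer-division mappings are hypothetical and are not taken from the project's notebooks. The family mapping relies on the General MIDI convention of 128 programs grouped into 16 families of 8; the velocity and timing reductions are one plausible reading of "hex (16)" and "compression (128)".

```python
from typing import List, NamedTuple


class Note(NamedTuple):
    """Hypothetical note record; field layout is an assumption."""
    start: int      # onset time in quantized steps
    duration: int   # duration in quantized steps
    program: int    # General MIDI program number, 0..127
    pitch: int      # MIDI pitch, 0..127
    velocity: int   # MIDI velocity, 0..127


def instrument_family(program: int) -> int:
    """Map a GM program (0..127) to one of 16 families of 8 programs each."""
    return program // 8


def velocity_bin(velocity: int) -> int:
    """Reduce a 0..127 velocity to 16 levels (assumed uniform bins of 8)."""
    return velocity // 8


def clamp_time(value: int, limit: int = 128) -> int:
    """Clamp a quantized time value into the range 0..limit-1 (assumed scheme)."""
    return max(0, min(limit - 1, value))


def sort_chord(notes: List[Note]) -> List[Note]:
    """Order simultaneous notes by family (low to high), then pitch (high to low)."""
    return sorted(notes, key=lambda n: (instrument_family(n.program), -n.pitch))


# Example chord: one violin note (program 40) and two piano notes (program 0).
chord = [
    Note(start=0, duration=4, program=40, pitch=60, velocity=90),
    Note(start=0, duration=4, program=0, pitch=48, velocity=80),
    Note(start=0, duration=4, program=0, pitch=72, velocity=80),
]
ordered = sort_chord(chord)
```

In this sketch the piano notes come first (family 0), highest pitch first, followed by the violin note (family 5). A fixed ordering like this gives the same chord the same token sequence every time it appears, which is the usual motivation for sorting before tokenization.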
***
### Project Los Angeles
### Tegridy Code 2024