MIDI-GPT is a GPT-2 transformer for symbolic music generation trained on the GigaMIDI dataset. It supports bar-level infill, autoregressive multi-track generation, and attribute-conditioned generation (note density, polyphony, note duration).
Paper: MIDI-GPT: A Controllable Generative Model for Computer-Assisted Multitrack Music Composition (AAAI 2025)
GitHub: Metacreation/MIDI-GPT
PyPI: midigpt
| File | Context (bars) | Infill | Bar masking | Microtiming | Attributes |
|---|---|---|---|---|---|
| yellow.pt | 4, 8 | yes | no | no | density, polyphony, note duration |
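To give a feel for what the attribute controls in the table describe, here is a minimal sketch that computes three comparable statistics for a single track: note density (note onsets per bar), peak polyphony (most notes sounding at once), and mean note duration. This is an illustration only and is not taken from the midigpt codebase. The `Note` tuple, the `track_attributes` function, and the fixed 4-beat bar are assumptions; the exact definitions and binning the model uses are described in the paper and repository.

```python
import math
from collections import namedtuple

# Hypothetical note representation: onset and duration measured in beats.
Note = namedtuple("Note", ["onset", "duration"])


def track_attributes(notes, beats_per_bar=4.0):
    """Return rough density, polyphony, and duration statistics for one track."""
    if not notes:
        return {"density": 0.0, "polyphony": 0, "mean_duration": 0.0}

    # Note density: onsets per bar, over the bars the track spans.
    end = max(n.onset + n.duration for n in notes)
    n_bars = max(1, math.ceil(end / beats_per_bar))
    density = len(notes) / n_bars

    # Peak polyphony via a sweep over note-on (+1) and note-off (-1) events.
    # Sorting tuples puts -1 before +1 at equal times, so a note that ends
    # exactly when another starts does not count as overlapping.
    events = []
    for n in notes:
        events.append((n.onset, 1))
        events.append((n.onset + n.duration, -1))
    events.sort()

    active = peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)

    mean_duration = sum(n.duration for n in notes) / len(notes)
    return {"density": density, "polyphony": peak, "mean_duration": mean_duration}
```

Statistics like these are typically quantized into a small number of levels before being used as conditioning signals, which is why the controls are usually exposed as discrete settings rather than raw values.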
pip install "midigpt[inference]"
from midigpt import Score
from midigpt.inference.engine import InferenceEngine
from midigpt.inference.config import GenerationRequest, InferenceConfig, TrackPrompt
# Download and cache the model automatically
engine = InferenceEngine.from_pretrained("yellow")
# Load a MIDI file
score = Score.from_midi("my_song.mid")
# Infill bars 4–7 on track 0 given surrounding context
request = GenerationRequest(
    tracks=[
        TrackPrompt(id=0, bars=list(range(4, 8))),
    ],
    config=InferenceConfig(model_dim=8),
)
session = engine.session(score, request)
result = session.run()
result.to_midi("output.mid")
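For readers curious what bar-level infill means at the sequence level, the sketch below shows one common way to format the task: each selected bar is replaced by a placeholder in the prompt, and the removed content becomes a separate target segment that the model generates afterwards. It is a plain-Python illustration under stated assumptions and does not use the midigpt tokenizer. The token names (`BAR_START`, `FILL_PLACEHOLDER`, `FILL_START`, and so on) and the `build_infill_sequence` helper are hypothetical and may differ from the actual vocabulary.

```python
def build_infill_sequence(bars, infill_indices):
    """Split a track into a masked prompt and per-bar infill targets.

    bars: list of bars, each a list of string tokens.
    infill_indices: iterable of bar indices to mask out.
    Returns (prompt_tokens, targets), with one target segment per masked bar.
    """
    selected = set(infill_indices)
    prompt = []
    targets = []
    for index, bar in enumerate(bars):
        prompt.append("BAR_START")
        if index in selected:
            # The bar's content is hidden from the prompt...
            prompt.append("FILL_PLACEHOLDER")
            # ...and becomes a delimited segment to be generated.
            targets.append(["FILL_START", *bar, "FILL_END"])
        else:
            prompt.extend(bar)
        prompt.append("BAR_END")
    return prompt, targets


# Eight one-note bars; mask bars 4-7, mirroring range(4, 8) in the usage example.
example_bars = [[f"NOTE_{i}"] for i in range(8)]
example_prompt, example_targets = build_infill_sequence(example_bars, range(4, 8))
```

Because the unmasked bars on both sides stay in the prompt, the generated segments can be conditioned on past and future context, which is what distinguishes infill from plain left-to-right continuation.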
Models were trained on GigaMIDI v2.0.0 using the midigpt training pipeline
with PyTorch Lightning. Training configs and the preprocessing pipeline are
available in the GitHub repository.
@misc{pasquier2025midigptcontrollablegenerativemodel,
  title={MIDI-GPT: A Controllable Generative Model for Computer-Assisted Multitrack Music Composition},
  author={Philippe Pasquier and Jeff Ens and Nathan Fradet and Paul Triana and Davide Rizzotti and Jean-Baptiste Rolland and Maryam Safi},
  year={2025},
  eprint={2501.17011},
  archivePrefix={arXiv},
  primaryClass={cs.SD},
  url={https://arxiv.org/abs/2501.17011},
}
Creative Commons Attribution-NonCommercial 4.0 International (CC-BY-NC-4.0). Copyright (c) 2026 Metacreation Lab.