MINGUS: Weimar Jazz Database

Trained checkpoints for MINGUS (Melodic Improvisation Neural Generator Using Seq2seq), retrained from scratch on the Weimar Jazz Database.

Used by the comparison pipeline of the diploma research project kudrmax/jazz-generation-research, alongside CMT and BebopNet.

Two ablation runs

We ran MINGUS on WjazzD with two conditioning configurations to check whether the optimal conditioning reported in the paper carries over to our test set (40 solos: the cross-model intersection from wjazzd_split.json, not the 86-solo random 20% split used in the paper).

run                      pitch cond     duration cond  pitch test_ppl  pitch test_acc  duration test_ppl  duration test_acc
paper/ (default)         I-C-NC-B-BE-O  I-C-NC-B-BE-O  13.08           0.131           4.087              0.344
paper-optimal/           D-C-B-BE-O     B-BE-O         13.53           0.120           4.048              0.345
Madaghiele 2021 (paper)  D-C-B-BE-O     B-BE-O         11.01           0.163           4.140              0.323

Observations. The duration model in both of our runs reproduces or slightly beats the paper's test perplexity (4.05–4.09 vs 4.14), which suggests our pipeline is sound. The pitch model lags the paper by roughly 19–23% on test perplexity (13.08 and 13.53 vs 11.01) under both conditioning choices. This is most plausibly explained by our test set being a different (and apparently harder) 40-solo subset than the paper's 86-solo random split, since the conditioning swap that is supposed to help did not help on our set. We are keeping both runs for reference.
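The size of these gaps can be checked with a few lines of arithmetic. The sketch below only uses the test perplexities from the table above; the helper name `relative_gap` is ours and purely illustrative.

```python
# Test perplexities copied from the table above.
PAPER_PITCH_PPL = 11.01
PAPER_DURATION_PPL = 4.140

runs = {
    "paper/": {"pitch": 13.08, "duration": 4.087},
    "paper-optimal/": {"pitch": 13.53, "duration": 4.048},
}


def relative_gap(ours: float, reference: float) -> float:
    """Return (ours - reference) / reference; positive means worse perplexity."""
    return (ours - reference) / reference


gaps = {
    name: {
        "pitch": relative_gap(ppl["pitch"], PAPER_PITCH_PPL),
        "duration": relative_gap(ppl["duration"], PAPER_DURATION_PPL),
    }
    for name, ppl in runs.items()
}

for name, gap in gaps.items():
    print(f"{name:15s} pitch {gap['pitch']:+.1%}  duration {gap['duration']:+.1%}")
```

A positive value means our run has higher (worse) perplexity than the paper; a negative value means it is lower.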

Files

paper/                              # default conditioning (I-C-NC-B-BE-O for both)
├── pitchModel/MINGUS COND I-C-NC-B-BE-O Epochs 10.pt
├── durationModel/MINGUS COND I-C-NC-B-BE-O Epochs 10.pt
├── pitch_state.pt                  # resume checkpoint
└── duration_state.pt

paper-optimal/                      # paper §3.2 optimal: pitch=D-C-B-BE-O, duration=B-BE-O
├── pitchModel/MINGUS COND D-C-B-BE-O Epochs 10.pt
├── durationModel/MINGUS COND B-BE-O Epochs 10.pt
├── pitch_state.pt
└── duration_state.pt

The two final .pt files in pitchModel/ and durationModel/ are what the MINGUS authors' original C_generate/generate.py and our GeneratorMingus wrapper expect at those exact paths (with spaces in the filenames, as published by the authors).

The *_state.pt files are checkpoints produced by our resume-aware B_train/train.py patch. They are useful only if you want to continue training from epoch 10 instead of starting fresh.
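Because the generator expects the final checkpoints at exact paths, it can help to verify a local copy before running generation. The following is a minimal sketch using only the standard library; `EXPECTED_FILES` and `missing_checkpoints` are hypothetical names of ours, not part of MINGUS or of our wrapper.

```python
from pathlib import Path

# Relative paths copied from the file tree above (note the spaces in the names).
EXPECTED_FILES = {
    "paper": [
        "pitchModel/MINGUS COND I-C-NC-B-BE-O Epochs 10.pt",
        "durationModel/MINGUS COND I-C-NC-B-BE-O Epochs 10.pt",
    ],
    "paper-optimal": [
        "pitchModel/MINGUS COND D-C-B-BE-O Epochs 10.pt",
        "durationModel/MINGUS COND B-BE-O Epochs 10.pt",
    ],
}


def missing_checkpoints(root, run):
    """Return the expected final checkpoints of `run` that are absent under `root`."""
    run_dir = Path(root) / run
    return [rel for rel in EXPECTED_FILES[run] if not (run_dir / rel).is_file()]
```

An empty list from `missing_checkpoints("result", "paper-optimal")` would mean both final checkpoints are in place under a local `result` directory.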

Common training setup

  • Architecture: two parallel Transformer phases (pitch model → duration model). 4 layers / 4 heads / d_model=200 each.
  • Dataset: Weimar Jazz Database, 4/4 solos only. Split (train/val/test): 340 / 42 / 40. Test = canonical cross-model intersection (details in source repo).
  • Training: 10 epochs per phase, BPTT 35, batch 20, SGD+momentum, StepLR.
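The numbers in this list can be collected in one place and sanity-checked. This is only a sketch: the `PhaseConfig` class and its field names are illustrative and do not mirror the actual MINGUS training script, and the per-head dimension assumes standard multi-head attention, which splits d_model evenly across heads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseConfig:
    # Values taken from the list above; field names are illustrative only.
    n_layers: int = 4
    n_heads: int = 4
    d_model: int = 200
    epochs: int = 10
    bptt: int = 35
    batch_size: int = 20

    @property
    def head_dim(self) -> int:
        # Assumes standard multi-head attention: d_model is split evenly across heads.
        assert self.d_model % self.n_heads == 0
        return self.d_model // self.n_heads


# Solo counts per split, as listed above.
SPLIT = {"train": 340, "val": 42, "test": 40}
total = sum(SPLIT.values())
fractions = {name: count / total for name, count in SPLIT.items()}

cfg = PhaseConfig()
print(cfg.head_dim, total, {name: round(frac, 3) for name, frac in fractions.items()})
```

The split covers 422 solos in total, so the 40-solo test set is a little under 10% of the data, compared with the 20% random split used in the paper.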

Download

pip install -U huggingface_hub

# paper-optimal run (recommended for comparison-pipeline use)
hf download maxkudryashov/mingus-1 \
  "paper-optimal/pitchModel/MINGUS COND D-C-B-BE-O Epochs 10.pt" \
  "paper-optimal/durationModel/MINGUS COND B-BE-O Epochs 10.pt" \
  --local-dir result

# default-conditioning run
hf download maxkudryashov/mingus-1 \
  "paper/pitchModel/MINGUS COND I-C-NC-B-BE-O Epochs 10.pt" \
  "paper/durationModel/MINGUS COND I-C-NC-B-BE-O Epochs 10.pt" \
  --local-dir result
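The same files can also be fetched from Python with `hf_hub_download` from huggingface_hub. This is a minimal sketch of an alternative to the CLI commands above; the `download_run` helper is our own illustrative name, and the repository ID and filenames are the ones used in those commands.

```python
REPO_ID = "maxkudryashov/mingus-1"

# Same two files as the paper-optimal CLI command above.
PAPER_OPTIMAL_FILES = [
    "paper-optimal/pitchModel/MINGUS COND D-C-B-BE-O Epochs 10.pt",
    "paper-optimal/durationModel/MINGUS COND B-BE-O Epochs 10.pt",
]


def download_run(filenames, local_dir="result"):
    """Download each file from the Hub into `local_dir` and return the local paths."""
    # Imported lazily so this module can be imported without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return [
        hf_hub_download(repo_id=REPO_ID, filename=name, local_dir=local_dir)
        for name in filenames
    ]


# Usage (requires network access):
# paths = download_run(PAPER_OPTIMAL_FILES)
```

Passing the two `paper/...` filenames instead fetches the default-conditioning run.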

Reproducibility

Trained via the Colab notebook at models/MINGUS/training/colab_train.ipynb in our MINGUS fork. The notebook is idempotent: rerunning it resumes from the last completed epoch via <work_dir>/<phase>_state.pt.
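The resume behaviour follows a common pattern: record the last completed epoch after each epoch and skip completed epochs on the next run. The sketch below illustrates that pattern only. It is not the actual train.py patch: it stores a JSON counter rather than a PyTorch .pt state, and `train_with_resume` and `epochs_done` are hypothetical names.

```python
import json
from pathlib import Path


def train_with_resume(work_dir, phase, total_epochs, train_one_epoch):
    """Run the remaining epochs for `phase` and return the epoch numbers executed."""
    # Illustrative JSON state file; the real notebook uses <phase>_state.pt.
    state_path = Path(work_dir) / f"{phase}_state.json"
    done = 0
    if state_path.exists():
        done = json.loads(state_path.read_text())["epochs_done"]

    executed = []
    for epoch in range(done + 1, total_epochs + 1):
        train_one_epoch(epoch)
        # Persist only after the epoch has finished, so an interrupted epoch is redone.
        state_path.write_text(json.dumps({"epochs_done": epoch}))
        executed.append(epoch)
    return executed
```

Calling the function a second time with the same arguments executes no epochs, which is the sense in which such a loop is idempotent.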

paper/ was trained on a Colab CPU runtime (628 s pitch + 484 s duration ≈ 18 min). paper-optimal/ was trained on a Colab A100 GPU (53 s pitch + 51 s duration ≈ 2 min).
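The totals can be reproduced from the per-phase timings. Note that the two runs use different conditioning configurations, so the ratio computed in this sketch is only a rough indication of the CPU/GPU difference, not a controlled benchmark.

```python
# Per-phase wall-clock training times in seconds, as reported above.
cpu_seconds = {"pitch": 628, "duration": 484}  # paper/ on Colab CPU
gpu_seconds = {"pitch": 53, "duration": 51}    # paper-optimal/ on Colab A100

cpu_total = sum(cpu_seconds.values())
gpu_total = sum(gpu_seconds.values())
ratio = cpu_total / gpu_total

print(f"CPU {cpu_total / 60:.1f} min, GPU {gpu_total / 60:.1f} min, ratio {ratio:.1f}x")
```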
