DiCoSe: Improving Music Source Separation with Diffusion and Consistency Refinement

Pre-trained checkpoints for "Improving Music Source Separation with Diffusion and Consistency Refinement".

This repo hosts checkpoints for two experimental tracks described in the paper:

  1. A custom U-Net separator trained on Slakh2100.
  2. A BS-RoFormer separator (backbone from Music-Source-Separation-Training) trained on MUSDB18-HQ.

For each track, three checkpoints are provided, corresponding to the three stages of the method: a Deterministic separator, a Diffusion refinement model trained on top of it, and a Consistency-Distilled (CD) model distilled from the diffusion model for fast (1-2 step) inference.
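To illustrate what few-step inference with a consistency model means, here is a minimal sketch of generic multistep consistency sampling. It is not the paper's sampler: the noise range (`EPS`, `SIGMA_MAX`), the geometric schedule, and the toy `toy_consistency_fn` are all assumptions made for illustration, and the real CD checkpoints are presumably also conditioned on the input mixture, which this sketch omits.

```python
import numpy as np

EPS = 0.002        # smallest noise level (assumed value, for illustration only)
SIGMA_MAX = 80.0   # largest noise level (assumed value, for illustration only)


def multistep_consistency_sample(consistency_fn, shape, num_steps, rng):
    """Generic multistep consistency sampling; num_steps=1 is one-shot generation."""
    # Hypothetical schedule: noise levels spaced geometrically from SIGMA_MAX
    # down toward EPS, one level per inference step.
    sigmas = np.geomspace(SIGMA_MAX, EPS, num_steps + 1)[:-1]
    # Step 1: map pure noise straight to a clean estimate.
    x = consistency_fn(SIGMA_MAX * rng.standard_normal(shape), sigmas[0])
    # Further steps: re-noise the estimate to a lower level, then denoise again.
    for sigma in sigmas[1:]:
        z = rng.standard_normal(shape)
        x_noisy = x + np.sqrt(sigma**2 - EPS**2) * z
        x = consistency_fn(x_noisy, sigma)
    return x


# Toy stand-in for a trained model: the exact consistency function when the
# data distribution is a single point `target`.
target = np.array([0.5, -0.25, 1.0])


def toy_consistency_fn(x, sigma):
    return target + (EPS / sigma) * (x - target)


one_step = multistep_consistency_sample(toy_consistency_fn, target.shape, 1, np.random.default_rng(0))
four_step = multistep_consistency_sample(toy_consistency_fn, target.shape, 4, np.random.default_rng(0))
```

In this sketch `num_steps` is the number of model evaluations, so a 1-step run costs a single forward pass and a 4-step run costs four.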

Files

| File | Track | Stage | SDR (dB, avg. across stems) |
| --- | --- | --- | --- |
| Deterministic_model_unet/model.ckpt | U-Net / Slakh2100 | Deterministic | 10.89 |
| diffusion_model_unet/model.ckpt | U-Net / Slakh2100 | Diffusion | 11.34 |
| CD_unet/model.ckpt | U-Net / Slakh2100 | Consistency-Distilled | 11.42 (T=1) → 11.95 (T=4) |
| Deterministic_model_MSST_bs_roformer/model.ckpt | BS-RoFormer / MUSDB18 | Deterministic | 9.84 |
| diffusion_model_MSST_bs_roformer/model.ckpt | BS-RoFormer / MUSDB18 | Diffusion | 10.34 |
| CD_MSST_bs_roformer/model.ckpt | BS-RoFormer / MUSDB18 | Consistency-Distilled | 10.41 (T=1) → 10.40 (T=2) |

SDR is the median over 1-second chunks (computed with museval), averaged across stems on each track's test set, as reported in the paper. Each Consistency-Distilled (CD) checkpoint is a single model evaluated at different numbers of inference steps (T). More steps help on the U-Net track (11.42 dB at T=1, 11.95 dB at T=4); on the BS-RoFormer track the result is essentially unchanged (10.41 dB at T=1, 10.40 dB at T=2).
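The aggregation described above (a median over 1-second chunks, then a mean across stems) can be sketched as follows. This is a simplified illustration, not museval's computation: it uses the plain energy-ratio SDR and mono signals, whereas museval implements the BSS Eval metrics, so the numbers it produces will not match the table. The function names are hypothetical.

```python
import numpy as np


def chunk_median_sdr(reference, estimate, sample_rate, chunk_seconds=1.0):
    """Median over non-overlapping chunks of plain SDR in dB (1-D mono arrays)."""
    chunk = int(sample_rate * chunk_seconds)
    n_chunks = len(reference) // chunk
    sdrs = []
    for i in range(n_chunks):
        ref = reference[i * chunk:(i + 1) * chunk]
        est = estimate[i * chunk:(i + 1) * chunk]
        signal = np.sum(ref ** 2)
        noise = np.sum((ref - est) ** 2)
        if signal == 0 or noise == 0:
            continue  # SDR is undefined for silent or perfectly matched chunks
        sdrs.append(10 * np.log10(signal / noise))
    return float(np.median(sdrs)) if sdrs else float("nan")


def average_sdr_across_stems(references, estimates, sample_rate):
    """Mean of the per-stem chunk-median SDR; inputs are dicts keyed by stem name."""
    per_stem = [
        chunk_median_sdr(references[name], estimates[name], sample_rate)
        for name in references
    ]
    return float(np.mean(per_stem))


# Toy check: an estimate scaled by 0.9 leaves 10% error amplitude, i.e. 20 dB SDR.
ref = np.sin(np.linspace(0.0, 20.0, 300))
toy_sdr = chunk_median_sdr(ref, 0.9 * ref, sample_rate=100)
```

Taking the median over chunks makes the per-stem score robust to a few very bad or very good seconds, which is why it is commonly preferred over a single whole-track SDR.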

Usage

See the GitHub repo for the download script, environment setup, and eval configs that load these checkpoints. Training/eval code for the BS-RoFormer track is coming soon; checkpoints are published now for reference.
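Until you set up the GitHub repo, a single checkpoint can also be fetched directly from the Hub. The sketch below is not the repo's download script. It assumes the Hub repository ID is `karchkha/DiCoSe` and uses `hf_hub_download` from the `huggingface_hub` package; the `CHECKPOINTS` table and helper names are hypothetical, with file paths copied from the Files section.

```python
REPO_ID = "karchkha/DiCoSe"  # assumption: the Hub repository ID for this model card

# (track, stage) -> file path inside the repository, as listed under "Files".
CHECKPOINTS = {
    ("unet", "deterministic"): "Deterministic_model_unet/model.ckpt",
    ("unet", "diffusion"): "diffusion_model_unet/model.ckpt",
    ("unet", "cd"): "CD_unet/model.ckpt",
    ("bs_roformer", "deterministic"): "Deterministic_model_MSST_bs_roformer/model.ckpt",
    ("bs_roformer", "diffusion"): "diffusion_model_MSST_bs_roformer/model.ckpt",
    ("bs_roformer", "cd"): "CD_MSST_bs_roformer/model.ckpt",
}


def checkpoint_filename(track, stage):
    """Return the in-repo path for a given track and stage."""
    return CHECKPOINTS[(track, stage)]


def download_checkpoint(track, stage):
    """Download one checkpoint and return its local cache path (needs network)."""
    # Imported lazily so the lookup helpers work without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=checkpoint_filename(track, stage))
```

For example, `download_checkpoint("unet", "cd")` would fetch the U-Net consistency-distilled checkpoint. Loading it still requires the model definitions and configs from the GitHub repo.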

Citation

@misc{karchkhadze2024improvingsourceextractiondiffusion,
  title={Improving Music Source Separation with Diffusion and Consistency Refinement},
  author={Tornike Karchkhadze and Mohammad Rasool Izadi and Shuo Zhang and Shlomo Dubnov},
  year={2024},
  eprint={2412.06965},
  archivePrefix={arXiv},
  primaryClass={cs.SD},
  url={https://arxiv.org/abs/2412.06965},
}

License

MIT
