This is currently an experimental model!

How to use the model?

Try it with ZFTurbo's Music-Source-Separation-Training
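A minimal usage sketch with ZFTurbo's Music-Source-Separation-Training repo. The config filename below is hypothetical, and the flags reflect the repo's typical inference.py interface; check the repo's README for the current options. The command is only echoed here, not executed:

```shell
# Hypothetical paths -- adjust to your own setup.
CKPT=model_chorus_bs_roformer_ep_267_sdr_24.1275.ckpt
CONFIG=config_chorus_bs_roformer.yaml   # hypothetical config name

# Typical inference invocation for the repo (assumed flags; verify against
# the repo's README before running):
echo python inference.py \
    --model_type bs_roformer \
    --config_path "$CONFIG" \
    --start_check_point "$CKPT" \
    --input_folder input/ \
    --store_dir output/
```

The separated male and female stems would then be written to the `--store_dir` folder.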

Description

I recently trained a model to separate male and female voices in choir singing, and the results far exceeded my expectations. However, because the training and validation data lack diversity (all of it consists of Chinese songs), I classify this as an experimental model.

The model separates the male and female voices in a chorus when both sing simultaneously. It cannot separate passages where male and female singers alternate (singing one at a time). A demo of the separation results can be heard here!

I used a total of 750 songs for training: 700 as the training set and 50 as the validation set. All songs come from the Opencpop and M4Singer datasets. The model was fine-tuned from model_bs_roformer_ep_317_sdr_12.9755.ckpt.

Of these, model_chorus_bs_roformer_ep_267_sdr_24.1275.ckpt has the following validation metrics:

Train epoch: 267
Instr male sdr: 24.4762 (Std: 1.5505)
Instr female sdr: 23.7788 (Std: 1.5168)
Metric avg sdr: 24.1275
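The reported average SDR is simply the mean of the two per-source validation SDRs above, as this quick check confirms:

```python
# Per-source validation SDRs from the log above (in dB).
male_sdr = 24.4762
female_sdr = 23.7788

# The "Metric avg sdr" line is the arithmetic mean of the two sources.
avg_sdr = (male_sdr + female_sdr) / 2
print(f"avg sdr: {avg_sdr:.4f}")  # -> avg sdr: 24.1275
```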

Thanks

Thanks to CN17161 for the GPU compute support!
