cemsubakan committed
Commit cc7479f
1 Parent(s): 297cd7c

Update README.md

Files changed (1)
  1. README.md +13 -21
README.md CHANGED
@@ -42,15 +42,16 @@ metrics:

 <br/><br/>

- # SepFormer trained on WHAMR! (16k sampling frequency)

- This repository provides all the necessary tools to perform audio source separation with a [SepFormer](https://arxiv.org/abs/2010.13154v2) model, implemented with SpeechBrain, and pretrained on [WHAMR!](http://wham.whisper.ai/) dataset with 16k sampling frequency, which is basically a version of WSJ0-Mix dataset with environmental noise and reverberation in 16k. For a better experience we encourage you to learn more about [SpeechBrain](https://speechbrain.github.io). The given model performance is 13.5 dB SI-SNRi on the test set of WHAMR! dataset.

- | Release | Test-Set SI-SNRi | Test-Set SDRi |
- |:-------------:|:--------------:|:--------------:|
- | 30-03-21 | 13.5 dB | 13.0 dB |

 ## Install SpeechBrain
 
@@ -67,20 +68,13 @@ Please notice that we encourage you to read our tutorials and learn more about [

 ### Perform source separation on your own audio file

 ```python
- from speechbrain.pretrained import SepformerSeparation as separator
- import torchaudio
-
- model = separator.from_hparams(source="speechbrain/sepformer-whamr16k", savedir='pretrained_models/sepformer-whamr16k')
-
- # for custom file, change path
- est_sources = model.separate_file(path='speechbrain/sepformer-whamr16k/test_mixture16k.wav')
-
- torchaudio.save("source1hat.wav", est_sources[:, :, 0].detach().cpu(), 16000)
- torchaudio.save("source2hat.wav", est_sources[:, :, 1].detach().cpu(), 16000)
 ```

@@ -117,11 +111,9 @@ pip install -e .

 3. Run Training:

 ```
- cd recipes/WHAMandWHAMR/separation/
- python train.py hparams/sepformer-whamr.yaml --data_folder=your_data_folder --sample_rate=16000
 ```

 You can find our training results (models, logs, etc.) [here](https://drive.google.com/drive/folders/1QiQhp1vi5t4UfNpNETA48_OmPiXnUy8O?usp=sharing).
 
 <br/><br/>

+ # SI-SNR Estimator

+ This repository provides the SI-SNR estimator model introduced for the REAL-M dataset.

+ | Release | Test-Set (WHAMR!) average l1 error |
+ |:-------------:|:--------------:|
+ | 18-10-21 | 1.7 dB |

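For context, the l1 error above is the mean absolute difference, in dB, between the estimator's predicted SI-SNR and the ground-truth SI-SNR of the separated sources on the WHAMR! test set. A minimal sketch of how SI-SNR and this per-example error are computed (NumPy, with made-up signals and a hypothetical predicted value):

```python
import numpy as np

def si_snr(est, ref):
    """Scale-invariant SNR in dB between an estimated and a reference signal."""
    ref = ref - ref.mean()
    est = est - est.mean()
    # Project the estimate onto the reference to remove any scaling.
    s_target = (np.dot(est, ref) / np.dot(ref, ref)) * ref
    e_noise = est - s_target
    return 10 * np.log10(np.sum(s_target**2) / np.sum(e_noise**2))

rng = np.random.default_rng(0)
ref = rng.standard_normal(16000)               # 1 s of audio at 16 kHz
est = ref + 0.1 * rng.standard_normal(16000)   # a fairly clean estimate

true_snr = si_snr(est, ref)        # roughly 20 dB for this noise level
predicted_snr = true_snr + 1.2     # hypothetical estimator output
l1_error = abs(predicted_snr - true_snr)
```

Averaging `l1_error` over the test set gives the figure reported in the table.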
 ## Install SpeechBrain

 ### Perform source separation on your own audio file

 ```python
+ model = separator.from_hparams(source="speechbrain/sepformer-whamr", savedir='pretrained_models/sepformer-whamr2')
+ est_sources = model.separate_file(path='speechbrain/sepformer-wsj02mix/test_mixture.wav')

+ snr_est_model = snrest.from_hparams(source="speechbrain/REAL-M-sisnr-estimator-main")

+ mix, fs = torchaudio.load('test_mixture.wav')
+ snrhat = snr_est_model.estimate_batch(mix, est_sources)
 ```

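The `+` lines above rely on `separator`, `snrest`, and `torchaudio` being imported earlier; those imports are not shown in this hunk (likely `from speechbrain.pretrained import SepformerSeparation as separator` plus an SNR-estimator interface as `snrest`, both of which are assumptions here, not confirmed by the diff). The call pattern and tensor shapes can be sketched with stand-in objects, so nothing is downloaded; shapes follow the README's own per-source indexing `est_sources[:, :, i]`:

```python
import numpy as np

# Mock stand-ins for the SpeechBrain pretrained objects (assumed interfaces,
# used only to illustrate the call pattern without fetching any model).
class MockSeparator:
    def separate_file(self, path):
        # (batch, time, n_sources) of separated waveforms
        return np.zeros((1, 16000, 2))

class MockSNREstimator:
    def estimate_batch(self, mix, est_sources):
        # one SI-SNR estimate (dB) per separated source
        return np.full((1, est_sources.shape[2]), 10.0)

model = MockSeparator()
est_sources = model.separate_file(path='speechbrain/sepformer-wsj02mix/test_mixture.wav')

mix = np.zeros((1, 16000))       # stand-in for the loaded mixture waveform
snr_est_model = MockSNREstimator()
snrhat = snr_est_model.estimate_batch(mix, est_sources)  # shape (1, 2)
```

Here `snrhat` holds one SI-SNR estimate per separated source, which is what the REAL-M estimator is scored on in the table above.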
 3. Run Training:

 ```
+ cd recipes/REAL-M/sisnr-estimation
+ python train.py hparams/pool_sisnrestimator.yaml --data_folder /yourLibri2Mixpath --base_folder_dm /yourLibriSpeechpath --rir_path /yourpathforwhamrRIRs --dynamic_mixing True --use_whamr_train True --whamr_data_folder /yourpath/whamr --base_folder_dm_whamr /yourpath/wsj0-processed/si_tr_s
 ```

 You can find our training results (models, logs, etc.) [here](https://drive.google.com/drive/folders/1QiQhp1vi5t4UfNpNETA48_OmPiXnUy8O?usp=sharing).