cemsubakan committed
Commit cc7479f
1 Parent(s): 297cd7c

Update README.md

Files changed (1)
  1. README.md +13 -21
README.md CHANGED
@@ -42,15 +42,16 @@ metrics:

 <br/><br/>

- # SepFormer trained on WHAMR! (16k sampling frequency)

- This repository provides all the necessary tools to perform audio source separation with a [SepFormer](https://arxiv.org/abs/2010.13154v2) model, implemented with SpeechBrain, and pretrained on [WHAMR!](http://wham.whisper.ai/) dataset with 16k sampling frequency, which is basically a version of WSJ0-Mix dataset with environmental noise and reverberation in 16k. For a better experience we encourage you to learn more about [SpeechBrain](https://speechbrain.github.io). The given model performance is 13.5 dB SI-SNRi on the test set of WHAMR! dataset.

- | Release | Test-Set SI-SNRi | Test-Set SDRi |
- |:-------------:|:--------------:|:--------------:|
- | 30-03-21 | 13.5 dB | 13.0 dB |

 ## Install SpeechBrain
 
@@ -67,20 +68,13 @@ Please notice that we encourage you to read our tutorials and learn more about [

 ### Perform source separation on your own audio file

 ```python
- from speechbrain.pretrained import SepformerSeparation as separator
- import torchaudio
-
- model = separator.from_hparams(source="speechbrain/sepformer-whamr16k", savedir='pretrained_models/sepformer-whamr16k')
-
- # for custom file, change path
- est_sources = model.separate_file(path='speechbrain/sepformer-whamr16k/test_mixture16k.wav')
-
- torchaudio.save("source1hat.wav", est_sources[:, :, 0].detach().cpu(), 16000)
- torchaudio.save("source2hat.wav", est_sources[:, :, 1].detach().cpu(), 16000)
 ```

@@ -117,11 +111,9 @@ pip install -e .

 3. Run Training:

 ```
- cd recipes/WHAMandWHAMR/separation/
- python train.py hparams/sepformer-whamr.yaml --data_folder=your_data_folder --sample_rate=16000
 ```

 You can find our training results (models, logs, etc.) [here](https://drive.google.com/drive/folders/1QiQhp1vi5t4UfNpNETA48_OmPiXnUy8O?usp=sharing).
 
 <br/><br/>

+ # SI-SNR Estimator

+ This repository provides the SI-SNR estimator model introduced for the REAL-M dataset.

+ | Release | Test-Set (WHAMR!) average l1 error |
+ |:-------------:|:--------------:|
+ | 18-10-21 | 1.7 dB |

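For context, the l1 error above is the mean absolute difference, in dB, between the estimator's predicted SI-SNR and the ground-truth SI-SNR of the separated sources on the WHAMR! test set. A minimal sketch of how SI-SNR and this per-example error are computed (NumPy, with made-up signals and a hypothetical predicted value):

```python
import numpy as np

def si_snr(est, ref):
    """Scale-invariant SNR in dB between an estimated and a reference signal."""
    ref = ref - ref.mean()
    est = est - est.mean()
    # Project the estimate onto the reference to remove any scaling.
    s_target = (np.dot(est, ref) / np.dot(ref, ref)) * ref
    e_noise = est - s_target
    return 10 * np.log10(np.sum(s_target**2) / np.sum(e_noise**2))

rng = np.random.default_rng(0)
ref = rng.standard_normal(16000)               # 1 s of audio at 16 kHz
est = ref + 0.1 * rng.standard_normal(16000)   # a fairly clean estimate

true_snr = si_snr(est, ref)        # roughly 20 dB for this noise level
predicted_snr = true_snr + 1.2     # hypothetical estimator output
l1_error = abs(predicted_snr - true_snr)
```

Averaging `l1_error` over the test set gives the figure reported in the table.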
 ## Install SpeechBrain

 ### Perform source separation on your own audio file

 ```python
+ model = separator.from_hparams(source="speechbrain/sepformer-whamr", savedir='pretrained_models/sepformer-whamr2')
+ est_sources = model.separate_file(path='speechbrain/sepformer-wsj02mix/test_mixture.wav')

+ snr_est_model = snrest.from_hparams(source="speechbrain/REAL-M-sisnr-estimator-main")

+ mix, fs = torchaudio.load('test_mixture.wav')
+ snrhat = snr_est_model.estimate_batch(mix, est_sources)
 ```

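The `+` lines above rely on `separator`, `snrest`, and `torchaudio` being imported earlier; those imports are not shown in this hunk (likely `from speechbrain.pretrained import SepformerSeparation as separator` plus an SNR-estimator interface as `snrest`, both of which are assumptions here, not confirmed by the diff). The call pattern and tensor shapes can be sketched with stand-in objects, so nothing is downloaded; shapes follow the README's own per-source indexing `est_sources[:, :, i]`:

```python
import numpy as np

# Mock stand-ins for the SpeechBrain pretrained objects (assumed interfaces,
# used only to illustrate the call pattern without fetching any model).
class MockSeparator:
    def separate_file(self, path):
        # (batch, time, n_sources) of separated waveforms
        return np.zeros((1, 16000, 2))

class MockSNREstimator:
    def estimate_batch(self, mix, est_sources):
        # one SI-SNR estimate (dB) per separated source
        return np.full((1, est_sources.shape[2]), 10.0)

model = MockSeparator()
est_sources = model.separate_file(path='speechbrain/sepformer-wsj02mix/test_mixture.wav')

mix = np.zeros((1, 16000))       # stand-in for the loaded mixture waveform
snr_est_model = MockSNREstimator()
snrhat = snr_est_model.estimate_batch(mix, est_sources)  # shape (1, 2)
```

Here `snrhat` holds one SI-SNR estimate per separated source, which is what the REAL-M estimator is scored on in the table above.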
 3. Run Training:

 ```
+ cd recipes/REAL-M/sisnr-estimation
+ python train.py hparams/pool_sisnrestimator.yaml --data_folder /yourLibri2Mixpath --base_folder_dm /yourLibriSpeechpath --rir_path /yourpathforwhamrRIRs --dynamic_mixing True --use_whamr_train True --whamr_data_folder /yourpath/whamr --base_folder_dm_whamr /yourpath/wsj0-processed/si_tr_s
 ```

 You can find our training results (models, logs, etc.) [here](https://drive.google.com/drive/folders/1QiQhp1vi5t4UfNpNETA48_OmPiXnUy8O?usp=sharing).