cemsubakan committed on
Commit fd72f65
1 Parent(s): 3bde974

Create README.md

Files changed (1): README.md +81 -0
README.md ADDED
@@ -0,0 +1,81 @@
---
language: "en"
thumbnail:
tags:
- Source Separation
- Speech Separation
- Audio Source Separation
- WHAM!
- SepFormer
- Transformer
license: "apache-2.0"
datasets:
- WHAM!
metrics:
- SI-SNRi
- SDRi

---

# SepFormer trained on WHAM!
This repository provides all the necessary tools to perform audio source separation with a [SepFormer](https://arxiv.org/abs/2010.13154v2) model, implemented with SpeechBrain and pretrained on the [WHAM!](http://wham.whisper.ai/) dataset. For a better experience, we encourage you to learn more about [SpeechBrain](https://speechbrain.github.io). The model achieves 16.3 dB SI-SNRi on the test set of the WHAM! dataset.

| Release | Test-Set SI-SNRi | Test-Set SDRi |
|:-------------:|:--------------:|:--------------:|
| 09-03-21 | 16.3 dB | 16.7 dB |

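SI-SNRi and SDRi measure how much the separated estimates improve over the unprocessed mixture in scale-invariant SNR and SDR, respectively. As a rough reference (a minimal sketch, not the metric code used in the SpeechBrain recipe), SI-SNR for a single estimate/target pair can be computed as follows:

```python
import torch

def si_snr(estimate: torch.Tensor, target: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Scale-invariant SNR (in dB) between two 1-D waveforms."""
    # Zero-mean both signals so the measure ignores DC offsets.
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target to find the optimal scaling of the target.
    scale = torch.dot(estimate, target) / (torch.dot(target, target) + eps)
    target_scaled = scale * target
    noise = estimate - target_scaled
    return 10 * torch.log10((target_scaled.pow(2).sum() + eps) / (noise.pow(2).sum() + eps))

# SI-SNRi is the improvement over using the mixture itself as the estimate:
#     si_snr(estimate, source) - si_snr(mixture, source)
```

Averaging this improvement over all test utterances and sources gives numbers like those reported in the table above.
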
## Install SpeechBrain

First of all, please install SpeechBrain with the following command:

```
pip install speechbrain
```

Please note that we encourage you to read our tutorials and learn more about [SpeechBrain](https://speechbrain.github.io).

### Perform source separation on your own audio file

```python
from speechbrain.pretrained import SepformerSeparation as separator
import torchaudio

# Download and load the pretrained SepFormer model
model = separator.from_hparams(source="speechbrain/sepformer-wham")

# Load the mixture to separate (shape: [channels, time])
mix, fs = torchaudio.load("yourspeechbrainpath/samples/audio_samples/test_mixture.wav")

# Separate; the output has shape [batch, time, num_sources]
est_sources = model.separate_batch(mix)
# Peak-normalize each estimated source
est_sources = est_sources / est_sources.abs().max(dim=1, keepdim=True)[0]

# Save the two estimated sources as 8 kHz wav files
torchaudio.save("source1hat.wav", est_sources[:, :, 0].detach().cpu(), 8000)
torchaudio.save("source2hat.wav", est_sources[:, :, 1].detach().cpu(), 8000)
```
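
WHAM! is an 8 kHz corpus, which is why the estimates above are saved at 8000 Hz; we assume the pretrained model likewise expects 8 kHz input. If your own recording uses a different sampling rate, a resampling step along these lines (file name is a placeholder) can be applied before calling the model:

```python
import torchaudio

mix, fs = torchaudio.load("your_recording.wav")  # placeholder path
if fs != 8000:
    # Resample to the 8 kHz rate assumed for the WHAM!-trained model.
    mix = torchaudio.transforms.Resample(orig_freq=fs, new_freq=8000)(mix)
```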

#### Referencing SpeechBrain

```
@misc{SB2021,
    author = {Ravanelli, Mirco and Parcollet, Titouan and Rouhe, Aku and Plantinga, Peter and Rastorgueva, Elena and Lugosch, Loren and Dawalatabad, Nauman and Ju-Chieh, Chou and Heba, Abdel and Grondin, Francois and Aris, William and Liao, Chien-Feng and Cornell, Samuele and Yeh, Sung-Lin and Na, Hwidong and Gao, Yan and Fu, Szu-Wei and Subakan, Cem and De Mori, Renato and Bengio, Yoshua},
    title = {SpeechBrain},
    year = {2021},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/speechbrain/speechbrain}},
}
```

#### Referencing SepFormer

```
@inproceedings{subakan2021attention,
    title = {Attention is All You Need in Speech Separation},
    author = {Cem Subakan and Mirco Ravanelli and Samuele Cornell and Mirko Bronzi and Jianyuan Zhong},
    year = {2021},
    booktitle = {ICASSP 2021}
}
```