yipjiaqi committed on
Commit
615e148
1 Parent(s): 9762049

Create README.md

Files changed (1)
  1. README.md +38 -0
README.md ADDED
@@ -0,0 +1,38 @@
+ ---
+ license: apache-2.0
+ ---
+
+ ## Demo
+
+ A demo with instructions on how to run inference on the model is available as a Colab notebook [here](https://colab.research.google.com/drive/1zKEaRFNITve7WPsqVNUuaRXiduR7H1Ki?usp=sharing).
+
+ The standalone version of this model, for inference with minimal dependencies, is available [here](https://github.com/Yip-Jia-Qi/spgm_standalone).
+
+ Training is handled by SpeechBrain and can be run through my fork of the SpeechBrain repository, found [here](https://github.com/Yip-Jia-Qi/speechbrain/tree/add_spgm).
+
+ ## Results
+ Here are the SI-SNRi results (in dB) on the test set of WSJ0-2Mix:
+
+ | Model | Data Augmentation | WSJ0-2Mix SI-SNRi (dB) |
+ | --- | --- | --- |
+ | spgm (paper) | SpeedPerturb | 22.1 |
+ | [spgm-base](https://huggingface.co/yipjiaqi/spgm-base) | DynamicMixing | 22.7 |
+ | [spgm-opt](https://huggingface.co/yipjiaqi/spgm-opt) | DynamicMixing | 23.0 |
+
+ In the original paper accepted to ICASSP, the only data augmentation used was speed perturbation. We subsequently trained the model with dynamic mixing, which improved performance from 22.1 dB to 22.7 dB SI-SNRi.
+
+ Additionally, after further exploring some hyperparameters, we obtained an optimized version of SPGM, spgm-opt, which achieves 23.0 dB SI-SNRi.
+
+ The weights and config of spgm-base and spgm-opt have been uploaded to Hugging Face and can be accessed using the code in the spgm_standalone [repo](https://github.com/Yip-Jia-Qi/spgm_standalone).
+
+ ## Citation
+
+ If you find this model useful, please cite:
+ ```bibtex
+ @INPROCEEDINGS{yip2023spgm,
+ author={Yip, Jia Qi and Zhao, Shengkui and Ma, Yukun and Ni, Chongjia and Zhang, Chong and Wang, Hao and Nguyen, Trung Hieu and Zhou, Kun and Ng, Dianwen and Chng, Eng Siong and others},
+ booktitle={ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
+ title={SPGM: Prioritizing Local Features for enhanced speech separation performance},
+ year={2024},
+ }
+ ```
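
Reviewer note on the metric: the Results table in this README reports SI-SNRi, the improvement in scale-invariant signal-to-noise ratio of the separated output over the unprocessed mixture. For readers unfamiliar with it, the following is a minimal pure-Python sketch of the commonly used definition. It is not the evaluation code used for the reported numbers (training and evaluation are handled by SpeechBrain), and the function names `si_snr` and `si_snr_improvement` are illustrative assumptions.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length 1-D signals.

    Hypothetical helper for illustration; follows the common definition
    (zero-mean both signals, project the estimate onto the target).
    """
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")

    # Remove the mean of each signal.
    est_mean = sum(estimate) / len(estimate)
    tgt_mean = sum(target) / len(target)
    est = [e - est_mean for e in estimate]
    tgt = [t - tgt_mean for t in target]

    # Project the estimate onto the target to get the "signal" part.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt) + eps
    scale = dot / tgt_energy
    projection = [scale * t for t in tgt]

    # Whatever is left after the projection counts as noise.
    noise = [e - p for e, p in zip(est, projection)]
    proj_energy = sum(p * p for p in projection)
    noise_energy = sum(n * n for n in noise)

    return 10.0 * math.log10((proj_energy + eps) / (noise_energy + eps))


def si_snr_improvement(estimate, mixture, target):
    """SI-SNRi: SI-SNR of the estimate minus SI-SNR of the raw mixture."""
    return si_snr(estimate, target) - si_snr(mixture, target)
```

Under this definition, rescaling the estimate by a constant leaves the score essentially unchanged, which is why the metric is called scale-invariant. A value such as 23.0 dB in the table means the separated output scores 23.0 dB higher against the reference source than the input mixture does, averaged over the test set.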