
This is a pre-trained checkpoint of Fast FullSubNet, a low-complexity, real-time-oriented speech denoising model trained on the 2020 Deep Noise Suppression Challenge dataset (DNS-INTERSPEECH-2020).

How to run

https://fullsubnet.readthedocs.io/en/latest/usage/getting_started.html
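The linked guide covers environment setup and the offline inference recipe. As a quick, illustrative sanity check that a downloaded checkpoint loads at all, it can be inspected with plain PyTorch; the filename and dictionary keys below are placeholders and are not guaranteed to match this release.

```python
# Minimal sketch: confirm the downloaded checkpoint loads and see what it
# contains. The filename and key names are assumptions for illustration only.
import torch

ckpt = torch.load("fast_fullsubnet_checkpoint.tar", map_location="cpu")
print(type(ckpt))
if isinstance(ckpt, dict):
    # A training checkpoint typically holds the model state_dict plus
    # optimizer/epoch bookkeeping, but the exact layout is recipe-specific.
    print(list(ckpt.keys()))
```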

Code

https://github.com/Audio-WestlakeU/FullSubNet

Note: The code doesn't support real-time streaming out of the box. See issue-67 for details.

Paper

Fast FullSubNet: Accelerate Full-band and Sub-band Fusion Model for Single-channel Speech Enhancement, Xiang Hao, Xiaofei Li

For many speech enhancement applications, a key requirement is that the system runs on a real-time, latency-sensitive, battery-powered platform, which strictly limits the algorithmic latency and computational complexity. In this work, we propose a new architecture named Fast FullSubNet, dedicated to accelerating the computation of FullSubNet. Specifically, Fast FullSubNet processes sub-band speech spectra in the mel-frequency domain using cascaded linear-to-mel full-band, sub-band, and mel-to-linear full-band models, so that the number of frequencies involved in the sub-band computation is vastly reduced. A down-sampling operation is then applied to the sub-band input sequence to further reduce the computational complexity along the time axis. Experimental results show that, compared to FullSubNet, Fast FullSubNet requires only 13% of the computational complexity and 16% of the processing time while achieving comparable or even better performance.
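For a concrete picture of that pipeline, the following is a minimal, illustrative PyTorch skeleton of the processing order described in the abstract (linear-to-mel projection, full-band model, temporally down-sampled sub-band model, mel-to-linear mapping). It is a sketch only: the layer types, sizes, and the down-sampling/up-sampling scheme are assumptions, not the authors' implementation; refer to the repository for the real model.

```python
# Illustrative skeleton of the Fast FullSubNet processing order; NOT the
# authors' implementation. All layer choices and sizes are assumptions.
import torch
import torch.nn as nn

class FastFullSubNetSketch(nn.Module):
    def __init__(self, n_freqs=257, n_mels=64, hidden=384, down_sample=2):
        super().__init__()
        self.down_sample = down_sample
        # 1) Project the linear-frequency magnitude spectrum to the mel domain.
        self.linear_to_mel = nn.Linear(n_freqs, n_mels)
        # 2) Full-band model operating on the (much smaller) mel spectrum.
        self.full_band = nn.LSTM(n_mels, hidden, num_layers=2, batch_first=True)
        self.full_band_out = nn.Linear(hidden, n_mels)
        # 3) Sub-band model: each mel bin plus its full-band embedding.
        self.sub_band = nn.LSTM(2, hidden // 8, num_layers=2, batch_first=True)
        self.sub_band_out = nn.Linear(hidden // 8, 1)
        # 4) Map the mel-domain estimate back to linear frequencies.
        self.mel_to_linear = nn.Linear(n_mels, n_freqs)

    def forward(self, mag):                # mag: (batch, frames, n_freqs)
        mel = self.linear_to_mel(mag)      # (batch, frames, n_mels)
        fb, _ = self.full_band(mel)
        fb = self.full_band_out(fb)        # full-band output per mel bin

        # Down-sample along time to cut sub-band cost, then treat each mel
        # bin as an independent sequence (the "sub-band" model).
        x = torch.stack([mel, fb], dim=-1)          # (B, T, M, 2)
        x = x[:, ::self.down_sample]                # naive temporal down-sampling
        B, T, M, C = x.shape
        x = x.permute(0, 2, 1, 3).reshape(B * M, T, C)
        sb, _ = self.sub_band(x)
        sb = self.sub_band_out(sb).reshape(B, M, T).permute(0, 2, 1)

        # Up-sample back to the original frame rate, return to linear freq.
        sb = sb.repeat_interleave(self.down_sample, dim=1)[:, :mag.shape[1]]
        return self.mel_to_linear(sb)      # (batch, frames, n_freqs)

# Toy usage: 1 utterance, 100 STFT frames, 257 frequency bins.
if __name__ == "__main__":
    net = FastFullSubNetSketch()
    print(net(torch.rand(1, 100, 257)).shape)  # torch.Size([1, 100, 257])
```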

Performance

With Reverb test set:

| Method | WB-PESQ | NB-PESQ | SI-SDR | STOI |
| --- | --- | --- | --- | --- |
| Fast FullSubNet (118 epochs) | 2.882 | 3.42 | 15.33 | 0.9233 |
| FullSubNet (58 epochs, for comparison) | 2.987 | 3.496 | 15.756 | 0.926 |

No Reverb test set:

| Method | WB-PESQ | NB-PESQ | SI-SDR |
| --- | --- | --- | --- |
| Fast FullSubNet (118 epochs) | 2.694 | 3.222 | 16.34 |
| FullSubNet (58 epochs, for comparison) | 2.889 | 3.385 | 17.635 |
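For reference, SI-SDR is the scale-invariant signal-to-distortion ratio in dB (higher is better); WB-PESQ, NB-PESQ, and STOI are standard intrusive quality and intelligibility metrics. Below is a minimal NumPy sketch of SI-SDR as it is commonly defined; the repository's evaluation scripts may differ in details such as mean removal, so this is illustrative rather than the exact scoring code behind the numbers above.

```python
# Minimal sketch of SI-SDR (scale-invariant SDR, in dB) between an estimated
# and a reference signal; follows the common definition, not necessarily the
# exact evaluation script used for the table above.
import numpy as np

def si_sdr(estimate: np.ndarray, reference: np.ndarray, eps: float = 1e-8) -> float:
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to obtain the "target" component.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    noise = estimate - target
    return 10 * np.log10((np.dot(target, target) + eps) / (np.dot(noise, noise) + eps))
```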