Lightweight-Speech-Denoising

This project provides lightweight speech enhancement (denoising) models optimized for Axera NPU platforms, combining the DSP framework of RNNoise and the model architecture of GTCRN.

Key Features

Good Denoising Quality — Built on GTCRN and RNNoise frameworks; strong noise suppression even at very low parameter counts.
Ultra-Lightweight Models — Smallest model under 100 KB; CMM memory footprint below 150 KB.
Minimal Operators — tiny_v5 and conv_se are pure convolutional models with very few operator types; tiny_v5 supports quantization on the operator-limited AX525 platform.
End-to-End Workflow — Complete project covering the full pipeline, including export and quantization scripts.
Multi-Platform Support — Board inference validated on AX620Q, AX630C, and AX650; post-training quantization (PTQ) is also available for AX620L, AX637, and AX525.

Conversion Tool Links

For those interested in model conversion, refer to:

Supported Platforms

AX650
- M4N-Dock (AXera-Pi Pro)
- M.2 Accelerator Card
AX630C
AX620Q
AX620L
AX637
AX525 (tiny_v5 only)

How to use

Directory layout on device:

root@ax650:~# tree Lightweight-Speech-Denoising.axera
Lightweight-Speech-Denoising.axera
├── README.md
├── axmodels
│   ├── ax525_tiny_v5_setrain.axmodel
│   ├── ax620E_conv_se_setrain.axmodel
│   ├── ax620E_gtcrn_setrain.axmodel
│   ├── ax620E_tiny_v5_setrain.axmodel
│   ├── ax620L_conv_se_setrain.axmodel
│   ├── ax620L_gtcrn_setrain.axmodel
│   ├── ax620L_tiny_v5_setrain.axmodel
│   ├── ax637_conv_se_setrain.axmodel
│   ├── ax637_gtcrn_setrain.axmodel
│   ├── ax637_tiny_v5_setrain.axmodel
│   ├── ax630c_conv_se_setrain.axmodel
│   ├── ax630c_gtcrn_setrain.axmodel
│   ├── ax630c_tiny_v5_setrain.axmodel
│   ├── ax650_conv_se_setrain.axmodel
│   ├── ax650_gtcrn_setrain.axmodel
│   └── ax650_tiny_v5_setrain.axmodel
├── build_ax620q
│   └── test_se_denoise_ax
├── build_ax630c
│   └── test_se_denoise_ax
├── build_ax650
│   └── test_se_denoise_ax
├── models
│   ├── conv_se_ax650_config.ini
│   ├── gtcrn_7input_ax650_config.ini
│   ├── tiny_v5_ax650_config.ini
│   └── ...
├── run_ax620q_all.sh
├── run_ax630c_all.sh
├── run_ax650_all.sh
└── test_wavs
    └── mix.wav

Download all files from this repository to the device, then run the corresponding script for your platform:

# AX650
sh run_ax650_all.sh

# AX630C
sh run_ax630c_all.sh

# AX620Q
sh run_ax620q_all.sh

Output .wav files will be saved to output/<platform>_all/.
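
If the platform is chosen at run time, the mapping from platform name to run script can be wrapped in a small helper. The sketch below is only an illustration: the function names `run_script_for` and `output_dir_for` are hypothetical and not part of this repository, and it assumes that `<platform>` in the output path is the lowercase chip name (for example `output/ax650_all`).

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): map a platform name
# to the run script and the output directory described above.

run_script_for() {
    case "$1" in
        ax650|AX650)   echo "run_ax650_all.sh" ;;
        ax630c|AX630C) echo "run_ax630c_all.sh" ;;
        ax620q|AX620Q) echo "run_ax620q_all.sh" ;;
        *) echo "no run script for platform: $1" >&2; return 1 ;;
    esac
}

# Assumption: <platform> in output/<platform>_all/ is the lowercase chip name.
output_dir_for() {
    echo "output/$(echo "$1" | tr '[:upper:]' '[:lower:]')_all"
}

# Example (run from the repository root on the device):
#   script=$(run_script_for AX650) && sh "$script" && ls "$(output_dir_for AX650)"
```

Platforms without a run script in this repository (AX620L, AX637, AX525) fall through to the error branch, which matches the TODO list below.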

Inference Results

tiny_v5

Platform	Avg Infer (ms)	RTF	Realtime Speedup
AX650	0.160	0.0117	85.3x
AX630C	0.587	0.0232	43.1x
AX620Q	0.736	0.0332	30.1x
AX620L	TBD	TBD	TBD
AX637	TBD	TBD	TBD
AX525	TBD	TBD	TBD

conv_se

Platform	Avg Infer (ms)	RTF	Realtime Speedup
AX650	1.963	0.0365	27.4x
AX630C	7.803	0.1092	9.2x
AX620Q	14.938	0.1970	5.1x
AX620L	TBD	TBD	TBD
AX637	TBD	TBD	TBD

GTCRN

Platform	Avg Infer (ms)	RTF	Realtime Speedup
AX650	2.766	0.1756	5.7x
AX630C	2.835	0.1820	5.5x
AX620Q	3.535	0.2295	4.4x
AX620L	TBD	TBD	TBD
AX637	TBD	TBD	TBD

Test audio: mix.wav, duration 9.77 s, 16 kHz mono.
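
The RTF and Realtime Speedup columns above appear to be reciprocals of each other (for example, 1 / 0.0232 ≈ 43.1 for tiny_v5 on AX630C). Assuming the usual definition, RTF = total processing time / audio duration, the relationship can be sketched as follows. The function names are illustrative, and how the reported numbers were actually measured is not stated here.

```shell
#!/bin/sh
# Sketch under the assumption RTF = processing time / audio duration.

# Realtime speedup is the reciprocal of RTF.
rtf_to_speedup() {
    awk -v rtf="$1" 'BEGIN { printf "%.1f\n", 1 / rtf }'
}

# Total processing time (seconds) implied by an RTF for a clip of a given duration.
rtf_to_seconds() {
    awk -v rtf="$1" -v dur="$2" 'BEGIN { printf "%.3f\n", rtf * dur }'
}

rtf_to_speedup 0.0232        # tiny_v5 on AX630C, prints 43.1
rtf_to_seconds 0.0232 9.77   # same row on the 9.77 s test clip, prints 0.227
```

Because the table values are rounded, recomputing the speedup from a rounded RTF can differ from the listed figure in the last digit.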

TODO

AX525 board inference (quantization done)
AX620L board inference (quantization done)
AX637 board inference (quantization done)

References

RNNoise — Mozilla open-source DSP + RNN noise suppression framework; STFT/iSTFT and kiss_fft implementation reused in this project. https://github.com/xiph/rnnoise
GTCRN — Lightweight Gated Temporal Convolutional Recurrent Network for speech enhancement. https://github.com/Xiaobin-Rong/gtcrn
