FSMN-VAD · GGUF (FunASR llama.cpp runtime)

A GGUF build of FunASR's FSMN-VAD for the FunASR llama.cpp runtime, which targets CPU and edge devices. It provides native ggml voice-activity detection: long audio is segmented entirely in C++, with no Python at runtime.

Get it running (no Python, no build)

These are GGUF weights for the FunASR llama.cpp runtime, a whisper.cpp-style, single self-contained binary for CPU and edge devices. Grab a prebuilt binary, then fetch this model with the download script:

bash download-funasr-model.sh fsmn-vad ./gguf
# fsmn-vad is the VAD used by the ASR runtimes via --vad (see the SenseVoice / Paraformer / Fun-ASR-Nano GGUF repos)
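
A quick sanity check after downloading can catch a truncated or mis-saved file before it reaches the runtime. The sketch below relies only on the fact that every GGUF file begins with the four ASCII bytes `GGUF`. The helper name `is_gguf` is made up for this example, and the path `./gguf/fsmn-vad.gguf` is an assumption about where the download script places the file.

```shell
# is_gguf FILE
# Hypothetical helper (not part of the FunASR tooling): succeeds when FILE
# exists and starts with the 4-byte GGUF magic, i.e. bytes 47 47 55 46 ("GGUF").
is_gguf() {
  [ -f "$1" ] || return 1
  # Dump the first four bytes as hex and strip od's spacing before comparing.
  magic="$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')"
  [ "$magic" = "47475546" ]
}

# Example call (the path is an assumption about the download script's output):
#   is_gguf ./gguf/fsmn-vad.gguf && echo "looks like a GGUF file"
```

This only validates the magic bytes; it says nothing about whether the tensors inside are complete or match the expected architecture.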

Files

| File | Size | Notes |
|---|---|---|
| fsmn-vad.gguf | 1.7 MB | FSMN encoder + CMVN |

Usage

Pass --vad to any FunASR llama.cpp tool to segment long audio internally:

llama-funasr-sensevoice -m sensevoice-small.gguf -a long.wav --vad fsmn-vad.gguf
llama-funasr-cli --enc funasr-encoder-f16.gguf -m qwen3-0.6b-q8_0.gguf -a long.wav --vad fsmn-vad.gguf
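
For a folder of recordings, the same invocation can be wrapped in a loop. This is a minimal sketch, not part of the runtime: `transcribe_dir` and the variables `FUNASR_BIN`, `ASR_MODEL`, and `VAD_MODEL` are names invented for this example, and it assumes the tool writes its result to standard output. Only the `-m`, `-a`, and `--vad` flags shown above are used.

```shell
# Assumed defaults; override them in the environment as needed.
FUNASR_BIN="${FUNASR_BIN:-llama-funasr-sensevoice}"
ASR_MODEL="${ASR_MODEL:-sensevoice-small.gguf}"
VAD_MODEL="${VAD_MODEL:-fsmn-vad.gguf}"

# transcribe_dir DIR
# Hypothetical helper: runs the ASR tool with VAD segmentation on every
# .wav file directly inside DIR, one invocation per file.
transcribe_dir() {
  for wav in "$1"/*.wav; do
    # When the glob matches nothing it stays literal; skip that case.
    [ -e "$wav" ] || continue
    "$FUNASR_BIN" -m "$ASR_MODEL" -a "$wav" --vad "$VAD_MODEL"
  done
}

# Example call:
#   transcribe_dir ./recordings
```

Running one process per file reloads both models each time. That keeps the sketch simple, at the cost of some startup overhead on large batches.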

Segment boundaries match the PyTorch fsmn-vad front end within ~10 ms.
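
One way to check such a tolerance on your own audio is to compare two segment lists boundary by boundary. The sketch below is hypothetical throughout: it assumes you have already exported the segments from each implementation into plain text files with one `start_ms end_ms` pair per line, a format chosen for this example rather than one either runtime is known to emit. The helper name `max_boundary_diff_ms` is likewise made up.

```shell
# max_boundary_diff_ms REF HYP
# Hypothetical helper. REF and HYP are assumed to contain one segment per
# line as "start_ms end_ms". Prints the largest absolute difference between
# corresponding boundaries, in ms. Fails if the segment counts differ.
max_boundary_diff_ms() {
  [ $(wc -l < "$1") -eq $(wc -l < "$2") ] || return 1
  # paste puts matching segments side by side: ref_start ref_end hyp_start hyp_end
  paste "$1" "$2" | awk '
    function abs(x) { return x < 0 ? -x : x }
    {
      ds = abs($1 - $3)   # start boundary difference
      de = abs($2 - $4)   # end boundary difference
      if (ds > max) max = ds
      if (de > max) max = de
    }
    END { print max + 0 }'
}

# Example call (file names are placeholders):
#   max_boundary_diff_ms pytorch_segments.txt gguf_segments.txt
```

A differing segment count is treated as a failure rather than being aligned heuristically, since a split or merged segment is a different kind of mismatch than a boundary shifted by a few milliseconds.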

Links
