FSMN-VAD · GGUF (FunASR llama.cpp runtime)

GGUF build of FunASR's FSMN-VAD for the FunASR llama.cpp runtime on CPU and edge devices. Voice-activity detection runs natively in ggml, so long audio is segmented entirely in C++ with no Python at runtime.

Get it running (no Python, no build)

These are GGUF weights for the FunASR llama.cpp runtime, a whisper.cpp-style, single self-contained binary for CPU / edge. Grab a prebuilt binary, then fetch this model and run:

Prebuilt binaries (Linux / macOS / Windows) → GitHub Releases (tag runtime-llamacpp-v*)
One-page quickstart & benchmarks → funasr.com/llama-cpp

bash download-funasr-model.sh fsmn-vad ./gguf
# fsmn-vad is the VAD used by the ASR runtimes via --vad (see the SenseVoice / Paraformer / Fun-ASR-Nano GGUF repos)

Files

file	size	notes
`fsmn-vad.gguf`	1.7 MB	FSMN encoder + CMVN

Usage

Pass --vad to any FunASR llama.cpp tool to segment long audio internally:

llama-funasr-sensevoice -m sensevoice-small.gguf -a long.wav --vad fsmn-vad.gguf
llama-funasr-cli --enc funasr-encoder-f16.gguf -m qwen3-0.6b-q8_0.gguf -a long.wav --vad fsmn-vad.gguf
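
For a folder of recordings, the same SenseVoice command can be wrapped in a small shell function. This is a minimal sketch, not part of the runtime: the function name `transcribe_dir`, the `FUNASR_BIN` / `ASR_MODEL` / `VAD_MODEL` variables, and the output layout are hypothetical, and it assumes the tool prints its transcript to standard output, which this page does not state. Only the `-m`, `-a`, and `--vad` flags come from the commands above.

```shell
# Hypothetical settings; override them in the environment as needed.
FUNASR_BIN="${FUNASR_BIN:-llama-funasr-sensevoice}"
ASR_MODEL="${ASR_MODEL:-sensevoice-small.gguf}"
VAD_MODEL="${VAD_MODEL:-fsmn-vad.gguf}"

# transcribe_dir IN_DIR OUT_DIR
# Runs the tool once per .wav file in IN_DIR and saves whatever it prints
# to OUT_DIR/<name>.txt (assumption: the transcript goes to stdout).
transcribe_dir() {
  in_dir="$1"
  out_dir="$2"
  mkdir -p "$out_dir"
  for wav in "$in_dir"/*.wav; do
    # Skip the unexpanded pattern when the folder has no .wav files.
    [ -e "$wav" ] || continue
    name="$(basename "${wav%.wav}")"
    "$FUNASR_BIN" -m "$ASR_MODEL" -a "$wav" --vad "$VAD_MODEL" > "$out_dir/$name.txt"
  done
}
```

After sourcing the snippet, a call such as `transcribe_dir ./recordings ./transcripts` would process every `.wav` file in `./recordings`, with VAD segmentation handled inside each run.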

Segment boundaries match the PyTorch fsmn-vad front end within ~10 ms.
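
One way to spot-check a tolerance like this is to compare two segment lists line by line. The sketch below assumes each list has been saved as a plain-text file with one `start_ms end_ms` pair per line; that file format, the file names, and the `compare_segments` helper are hypothetical, since the output formats of the two runtimes are not described here.

```shell
# compare_segments REF_FILE NEW_FILE [TOLERANCE_MS]
# Each file is assumed to hold one "start_ms end_ms" pair per line.
# Returns 0 when both files have the same number of segments and every
# boundary differs by at most TOLERANCE_MS (default 10); returns 1 otherwise.
compare_segments() {
  paste "$1" "$2" | awk -v tol="${3:-10}" '
    function abs(x) { return x < 0 ? -x : x }
    # A row without four fields means one file has fewer segments.
    NF != 4 { print "segment count mismatch at line " NR; bad = 1; exit }
    abs($1 - $3) > tol || abs($2 - $4) > tol {
      print "line " NR " outside tolerance: " $0
      bad = 1
    }
    END { exit bad }'
}
```

For example, `compare_segments pytorch.txt ggml.txt 10` would list any segment whose start or end differs by more than 10 ms and exit non-zero if it finds one.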
