Fun-ASR-MLT-Nano-2512 - GGUF (ggml-quantised)
GGUF / ggml conversion of FunAudioLLM/Fun-ASR-MLT-Nano-2512 for use with the funasr backend in CrispStrobe/CrispASR. Multilingual variant: same architecture as funasr-nano-GGUF, broader language coverage (~31 languages including Korean, Vietnamese, Indonesian, Thai, Malay, Filipino, Arabic, Hindi, Bulgarian, German, French, Spanish, Italian, Portuguese, Dutch, Polish, Czech, Romanian, Greek, Finnish, Swedish, Turkish, Persian, Danish, Hungarian, Macedonian, Russian).
The architecture is identical to Fun-ASR-Nano-2512:
- 70-block SenseVoiceSmall SANM encoder (1 entry block @ 560→512 + 49 main blocks + 20 "tp" blocks, all 512-dim, 4 heads, FSMN k=11 depthwise convolution branch)
- 2-block Transformer audio adaptor (512 → 2048 → 1024 prelude + 2× MHA blocks at 1024, FFN inner = 256)
- Qwen3-0.6B LLM decoder (28 layers, GQA 16/8, head_dim 128, RoPE θ=1e6, RMSNorm eps=1e-6), the same body as Qwen3-ASR's decoder
- 1261 tensors total; only the LLM weights are tuned differently between Nano and MLT-Nano.
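The figures in the list above can be cross-checked with a few lines of arithmetic. This is a minimal sketch, assuming "GQA 16/8" means 16 query heads sharing 8 key/value heads; the helper name `gqa_widths` and the `ENCODER_BLOCKS` dictionary are illustrative and not part of CrispASR or FunASR.

```python
# Block counts quoted for the SANM encoder: 1 entry + 49 main + 20 "tp".
ENCODER_BLOCKS = {"entry": 1, "main": 49, "tp": 20}


def gqa_widths(n_q_heads, n_kv_heads, head_dim):
    """Return projection widths for grouped-query attention.

    Assumption: the query heads are split evenly across the KV heads.
    """
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("query heads must be a multiple of KV heads")
    return {
        "q_width": n_q_heads * head_dim,      # width of the Q projection
        "kv_width": n_kv_heads * head_dim,    # width of each of K and V
        "q_per_kv": n_q_heads // n_kv_heads,  # query heads per KV head
    }


if __name__ == "__main__":
    print("encoder blocks:", sum(ENCODER_BLOCKS.values()))
    print("decoder attention:", gqa_widths(16, 8, 128))
```

Under that reading, each K and V projection is 1024 wide, which is the same width as the adaptor's 1024-dim output, while the Q projection is twice that.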
Architecture note: no CTC path
Upstream config.yaml and funasr/models/fun_asr_nano/model.py declare a CTC decoder + head, but the published model.pt ships only audio_encoder.* + audio_adaptor.* + llm.* (zero ctc_decoder.* / ctc.ctc_lo.* keys). The LLM-decoder path is therefore the only viable inference path for these weights, and is what this GGUF and the CrispASR runtime implement.
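A check of this kind can be reproduced by grouping checkpoint key names by their top-level prefix. The sketch below works on any iterable of key strings; the example keys are hypothetical placeholders that only follow the prefixes named above, not real tensor names from model.pt.

```python
from collections import Counter

# Prefixes that a CTC decoder + head would use, per the note above.
CTC_PREFIXES = ("ctc_decoder.", "ctc.ctc_lo.")


def count_top_level_prefixes(keys):
    """Count key names by their first dotted component."""
    return Counter(key.split(".", 1)[0] for key in keys)


def has_ctc_weights(keys):
    """Return True if any key belongs to the CTC decoder or CTC head."""
    return any(key.startswith(CTC_PREFIXES) for key in keys)


# Hypothetical key names, for illustration only.
example_keys = [
    "audio_encoder.block0.weight",
    "audio_encoder.block1.weight",
    "audio_adaptor.linear1.weight",
    "llm.layers.0.weight",
]

if __name__ == "__main__":
    print(count_top_level_prefixes(example_keys))
    print("CTC weights present:", has_ctc_weights(example_keys))
```

With PyTorch installed, the keys would typically come from `torch.load("model.pt", map_location="cpu")`; whether the state dict sits at the top level or under a nested entry is an assumption to verify against the actual file.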
Files
| File | Size | Notes |
|---|---|---|
| funasr-mlt-nano-2512-f16.gguf | 1.98 GB | F16, full precision reference |
| funasr-mlt-nano-2512-q8_0.gguf | 1.27 GB | Q8_0, near-lossless |
| funasr-mlt-nano-2512-q4_k.gguf | 897 MB | Q4_K, recommended default |
Output is mixed-case and punctuated (mlt-nano's multilingual vocab keeps both), so no --punc-model post-processor is needed.
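To confirm that a downloaded file is a well-formed GGUF and to compare its tensor count with the 1261 figure quoted earlier, the fixed-size header can be read directly. This is a minimal sketch, assuming a little-endian GGUF of version 2 or later, where the 4-byte magic is followed by a uint32 version and two uint64 counts; `read_gguf_header` is an illustrative helper, not a CrispASR API.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4-byte magic + uint32 version + 2 x uint64 counts


def read_gguf_header(path):
    """Read the fixed-size GGUF header and return its fields as a dict.

    Assumes little-endian GGUF v2+ (v1 used 32-bit counts instead).
    """
    with open(path, "rb") as f:
        head = f.read(HEADER_SIZE)
    if len(head) < HEADER_SIZE or head[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    version, tensor_count, kv_count = struct.unpack("<IQQ", head[4:])
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


if __name__ == "__main__":
    import sys

    print(read_gguf_header(sys.argv[1]))
```

Running the script with the path to one of the files listed above prints the header fields, so the reported `tensor_count` can be compared by eye.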
Quick Start
git clone https://github.com/CrispStrobe/CrispASR
cd CrispASR
cmake -B build-ninja-compile -G Ninja -DCMAKE_BUILD_TYPE=Release
cmake --build build-ninja-compile --target crispasr
# Auto-download (recommended Q4_K)
./build-ninja-compile/bin/crispasr -m fun-asr-mlt-nano --auto-download -f samples/jfk.wav
# Or pin a specific file
hf download cstr/funasr-mlt-nano-GGUF funasr-mlt-nano-2512-q4_k.gguf --local-dir .
./build-ninja-compile/bin/crispasr -m funasr-mlt-nano-2512-q4_k.gguf -f samples/jfk.wav
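For scripted use, the second invocation above can be wrapped in a small Python helper. This is a sketch only: it uses just the `-m` and `-f` flags shown in the commands above, and it assumes the transcript is written to standard output, which should be verified against the actual binary. The function names are illustrative.

```python
import subprocess


def build_crispasr_command(binary, model, audio):
    """Build the argument list for one transcription run (-m model, -f audio)."""
    return [str(binary), "-m", str(model), "-f", str(audio)]


def transcribe(binary, model, audio):
    """Run crispasr and return whatever it printed to stdout.

    Assumption: the transcript goes to stdout; a non-zero exit code
    raises subprocess.CalledProcessError.
    """
    result = subprocess.run(
        build_crispasr_command(binary, model, audio),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(transcribe(
        "./build-ninja-compile/bin/crispasr",
        "funasr-mlt-nano-2512-q4_k.gguf",
        "samples/jfk.wav",
    ))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with audio paths that contain spaces.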
Licence + attribution
Upstream FunAudioLLM/Fun-ASR-MLT-Nano-2512:
- Code (the funasr Python package): Apache-2.0.
- Model weights: FunASR Model License v1.1 (Alibaba); commercial use OK with attribution. Confirmed on the upstream-tracking discussion in CrispStrobe/CrispASR#99.
These GGUF files are a quantised / repackaged distribution of the upstream weights and inherit the FunASR Model License v1.1. Please attribute Alibaba / FunAudioLLM in downstream products.
If you use this model, please also cite the upstream FunASR work. See the upstream model card for the canonical citation.