whisper-klein-nl

A tiny Dutch (nl) ASR model: a fine-tune of openai/whisper-tiny (nominally 39M params; the fine-tuned checkpoint is listed at 58M in the evaluation table) on LokaalHub/nl-asr-cv. Built for on-device use: small footprint, low real-time factor on CPU.

TL;DR

A single fine-tune on 74.4h of Dutch Common Voice takes WER from 44.03% (the openai/whisper-tiny baseline) to **22.41%** (a 49.1% relative reduction) on a held-out, speaker- and sentence-disjoint test split.
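
The relative figure follows directly from the two WER values quoted above. A minimal sketch of the arithmetic (the function name `relative_wer_reduction` is illustrative, not part of any library):

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline WER that the new system removes."""
    return (baseline_wer - new_wer) / baseline_wer


# WER values (in percent) as quoted in the TL;DR.
drop = relative_wer_reduction(44.03, 22.41)
print(f"{drop:.1%}")  # 49.1%
```

The absolute improvement is 21.62 WER points; dividing by the 44.03% baseline gives the relative reduction.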

3-axis evaluation (accuracy / footprint / speed)

All systems scored on the same held-out panel through one shared text normalizer (BasicTextNormalizer). RTF = CPU compute seconds per audio second (lower is faster).

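| Model | Params | Size (fp32) | RTF (CPU) | cv17-test WER (%) | fleurs-test WER (%) | Mean WER (%) |
| --- | --- | --- | --- | --- | --- | --- |
| LokaalHub/whisper-klein-nl (ours) | 58M | 230.7 MB | 0.161 | 28.63 | 40.13 | 34.38 |
| openai/whisper-tiny | 38M | 151.0 MB | 0.091 | 46.15 | 49.14 | 47.64 |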
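
To make the two measured axes concrete, here is a hedged sketch of how WER and RTF can be computed in plain Python. It is not the evaluation code behind the table: `simple_normalize` is a stand-in for BasicTextNormalizer (an assumption, its rules differ), the WER is pooled over the whole corpus (the card does not say whether the panel pools or averages per utterance), and `transcribe` is any callable you supply.

```python
import re
import time


def simple_normalize(text):
    # Stand-in normalizer (assumption, NOT BasicTextNormalizer):
    # lowercase, replace punctuation with spaces, collapse whitespace.
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def word_errors(reference, hypothesis):
    """Return (word-level edit distance, number of reference words)."""
    ref = simple_normalize(reference).split()
    hyp = simple_normalize(hypothesis).split()
    prev = list(range(len(hyp) + 1))  # distances against an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution, or match at cost 0
            ))
        prev = curr
    return prev[-1], len(ref)


def corpus_wer(pairs):
    """WER in percent, pooled over all (reference, hypothesis) pairs."""
    errors = words = 0
    for ref, hyp in pairs:
        e, n = word_errors(ref, hyp)
        errors += e
        words += n
    return 100.0 * errors / words


def real_time_factor(transcribe, audio, audio_seconds):
    """Wall-clock compute seconds per second of audio (lower is faster)."""
    start = time.perf_counter()
    transcribe(audio)
    return (time.perf_counter() - start) / audio_seconds


pairs = [
    ("Het is mooi weer vandaag.", "het is mooi weer"),
    ("Goedemorgen allemaal", "goedemorgen allemaal"),
]
wer = corpus_wer(pairs)  # one deleted word out of seven reference words
```

Scoring every system through the same normalizer matters because casing and punctuation differences would otherwise count as word errors and skew the comparison.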

Usage

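from transformers import pipeline
# Load the fine-tuned checkpoint as a speech-recognition pipeline.
asr = pipeline("automatic-speech-recognition", model="LokaalHub/whisper-klein-nl")
# Force Dutch transcription instead of relying on language auto-detection.
result = asr("audio.wav", generate_kwargs={"language": "nl", "task": "transcribe"})
print(result["text"])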

Training

Standard Hugging Face Seq2SeqTrainer fine-tune (bf16), built and verified by the tiny-asr-loop pipeline.

Limitations

Tiny-model fine-tune on read speech (Common Voice). The internal test split is small and speaker-disjoint — see the panel table for FLEURS / out-of-domain numbers.
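
For readers who want to reproduce a speaker- and sentence-disjoint split, the sketch below shows one simple way to build one. The card does not describe how the actual split was made, so this is an illustration under assumptions: clips are dicts with Common Voice style `client_id` and `sentence` fields, and `split_disjoint` is a hypothetical helper.

```python
def split_disjoint(clips, test_speakers):
    """Split clips so train and test share no speaker and no sentence.

    clips: list of dicts with "client_id" and "sentence" keys (assumed layout).
    test_speakers: set of speaker ids reserved for the test split.
    """
    test = [c for c in clips if c["client_id"] in test_speakers]
    test_sentences = {c["sentence"] for c in test}
    # Drop training clips whose prompt also occurs in the test split.
    train = [
        c for c in clips
        if c["client_id"] not in test_speakers
        and c["sentence"] not in test_sentences
    ]
    return train, test


clips = [
    {"client_id": "a", "sentence": "hallo wereld"},
    {"client_id": "a", "sentence": "goede morgen"},
    {"client_id": "b", "sentence": "hallo wereld"},
    {"client_id": "c", "sentence": "tot ziens"},
    {"client_id": "c", "sentence": "goede morgen"},
]
train, test = split_disjoint(clips, test_speakers={"c"})
```

In this example speaker "a" loses the clip "goede morgen" from training because speaker "c" reads the same prompt in the test split. Enforcing both constraints costs some training data, but it keeps the test WER from being inflated by memorized voices or prompts.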
