Vekol

Vekol-STT (Sorani, edge) — whisper-base

Central Kurdish (Sorani) speech-to-text that runs offline on CPU. It is a small Whisper model fine-tuned for Sorani that transcribes audio faster than real time on a laptop CPU. Part of the Vekol hub by Revge.

Model: vekol-stt-ckb-base (fine-tuned from openai/whisper-base, 74M)
Language: Central Kurdish / Sorani (ckb), Arabic script
Task: speech-to-text (transcription)
Accuracy: 29.0% WER, 7.95% CER (spacing-free) on the speaker-disjoint Common Voice 25.0 test split
Size: 72 / 36 MB (int8 / int4)
Runtime: ONNX Runtime (torch-free, used by the helper) or transformers / PyTorch — both formats included

License

CC-BY-NC 4.0 (non-commercial). Fine-tuned from OpenAI Whisper (MIT). The weights here are released non-commercial to keep the hosted service (vekol.krd) sustainable. See NOTICE. Commercial use needs a license — use the hosted API or get in touch.

Usage

The simplest path is the vekol_stt.py helper from the GitHub repo, which downloads this model and handles Sorani normalization. It runs on ONNX Runtime and numpy, without PyTorch:

python3 vekol_stt.py audio.wav --model base

Or use the model directly with transformers, which needs PyTorch and librosa:

pip install transformers librosa torch

When decoding, pass language="fa": Whisper has no Sorani token, so this model uses the Persian token as a script anchor.

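import librosa
from transformers import WhisperProcessor, WhisperForConditionalGeneration

# Load the processor (feature extractor + tokenizer) and the fine-tuned model.
proc = WhisperProcessor.from_pretrained("RevgeAI/vekol-stt-ckb-base")
model = WhisperForConditionalGeneration.from_pretrained("RevgeAI/vekol-stt-ckb-base").eval()

# Whisper expects 16 kHz mono audio; librosa resamples on load.
audio, _ = librosa.load("audio.wav", sr=16000)
# The feature extractor pads or truncates the input to Whisper's 30-second window.
feats = proc.feature_extractor(audio, sampling_rate=16000, return_tensors="pt").input_features
# language="fa" selects the Persian token that this model uses for Sorani.
ids = model.generate(feats, task="transcribe", language="fa", max_new_tokens=225)
print(proc.tokenizer.decode(ids[0], skip_special_tokens=True))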
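To check the faster-than-real-time claim on your own hardware, one option is to measure the real-time factor (RTF): processing time divided by audio duration, where a value below 1.0 means faster than real time. The sketch below is illustrative only; real_time_factor is a hypothetical helper, not part of vekol_stt.py, and it accepts any callable that turns a waveform into text.

```python
import time
from typing import Callable, Sequence, Tuple


def real_time_factor(
    transcribe: Callable[[Sequence[float]], str],
    audio: Sequence[float],
    sample_rate: int = 16000,
) -> Tuple[str, float]:
    """Return (text, RTF), where RTF = processing seconds / audio seconds.

    Hypothetical helper for illustration; `transcribe` is any function
    that maps a mono waveform to a transcript.
    """
    duration = len(audio) / sample_rate
    if duration <= 0:
        raise ValueError("audio is empty")
    start = time.perf_counter()
    text = transcribe(audio)
    elapsed = time.perf_counter() - start
    return text, elapsed / duration
```

To use it with the snippet above, wrap the feature-extraction, generate, and decode calls in a function and pass that function in. The first call usually includes one-time warm-up cost, so timing a second call gives a more representative figure.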

Notes

Trained on Common Voice 25.0 (ckb) and FLEURS (ckb_iq), normalized to Sorani (Arabic to Kurdish letter/digit folding; diacritics, ZWNJ and tatweel stripped).
Accuracy is on the official speaker-disjoint test split (no speaker leakage). CER is spacing-free because Kurdish has no standard word-spacing.
For the large models (down to ~1.9% CER) and real-time streaming, use vekol.krd.
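The normalization and spacing-free CER described in these notes can be sketched roughly as follows. This is a minimal illustration, not the helper's actual code: the folding table is an assumed subset (Arabic kaf and yeh forms mapped to the forms used in Sorani), digit folding is left out because the card does not specify the target digit set, and the function names are hypothetical. "Spacing-free" is interpreted here as removing all whitespace from both strings before computing character edit distance.

```python
import re

# Assumed, partial folding table (the helper's real table may differ).
LETTER_FOLD = str.maketrans({
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0649": "\u06CC",  # ARABIC LETTER ALEF MAKSURA -> ARABIC LETTER FARSI YEH
})

# Characters to strip: harakat (U+064B..U+0652), superscript alef (U+0670),
# tatweel (U+0640), and zero-width non-joiner (U+200C).
STRIP = re.compile("[\u064B-\u0652\u0670\u0640\u200C]")


def normalize_sorani(text: str) -> str:
    """Fold Arabic letter variants, strip marks, and collapse whitespace."""
    text = text.translate(LETTER_FOLD)
    text = STRIP.sub("", text)
    return " ".join(text.split())


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance over characters, using a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def spacing_free_cer(reference: str, hypothesis: str) -> float:
    """CER computed after normalization with all whitespace removed."""
    ref = "".join(normalize_sorani(reference).split())
    hyp = "".join(normalize_sorani(hypothesis).split())
    if not ref:
        raise ValueError("reference is empty after normalization")
    return edit_distance(ref, hyp) / len(ref)
```

With this definition, a hypothesis that differs from the reference only in where the spaces fall scores 0.0, which is the point of a spacing-free metric for a language without standardized word spacing.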

Citation

@software{vekol_stt_ckb_edge,
  title        = {Vekol-STT: Sorani (Central Kurdish) on-device STT},
  author       = {Shvan, Darvan},
  organization = {Revge},
  year         = {2026},
  url          = {https://github.com/Revge/vekol-stt-ckb-edge}
}

Built by Darvan Shvan at Revge, part of the Vekol hub.
