Böri — Kazakh ASR (bori-asr)

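Whisper large-v3-turbo, fine-tuned with LoRA on Kazakh Speech Corpus 2. The LoRA adapter weights are merged into the base model, so the checkpoint loads as a regular Whisper model.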

Metrics (500-sample eval set, greedy decoding)

  • CER (character error rate): 3.84%
  • WER (word error rate): 17.5%
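
For readers who want to reproduce numbers of this kind, here is a minimal sketch of how CER and WER are commonly computed from Levenshtein edit distance. It is an illustration only: the card does not say which scoring tool or text normalization was used. The function names are hypothetical, and the corpus-level aggregation (total errors divided by total reference length) is an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # substitution (free if equal)
            ))
        prev = cur
    return prev[-1]


def cer(refs, hyps):
    """Character error rate over a corpus (assumed aggregation: sum / sum)."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return errors / sum(len(r) for r in refs)


def wer(refs, hyps):
    """Word error rate over a corpus, splitting on whitespace."""
    errors = sum(edit_distance(r.split(), h.split()) for r, h in zip(refs, hyps))
    return errors / sum(len(r.split()) for r in refs)


# Toy example: one wrong character costs a whole word in WER.
refs = ["сәлем әлем"]
hyps = ["сәлем элем"]
print(f"CER={cer(refs, hyps):.2f} WER={wer(refs, hyps):.2f}")
```

A single wrong character makes the whole word count as an error, which is one reason WER is typically several times higher than CER.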

Usage

import librosa
import numpy as np
from transformers import WhisperForConditionalGeneration, WhisperProcessor

m = WhisperForConditionalGeneration.from_pretrained('zhdoka/bori-asr').eval()
p = WhisperProcessor.from_pretrained('zhdoka/bori-asr')
# Load as 16 kHz mono, peak-normalize (skipped for all-zero audio),
# and keep at most the first 30 s.
a, _ = librosa.load('audio.wav', sr=16000, mono=True)
a = a / (np.abs(a).max() or 1)
a = a[:30 * 16000]
feat = p(a, sampling_rate=16000, return_tensors='pt').input_features
# num_beams=1 selects greedy decoding.
ids = m.generate(feat, language='kazakh', task='transcribe', num_beams=1, max_new_tokens=225)
# Output is lowercased and stripped of surrounding whitespace.
print(p.batch_decode(ids, skip_special_tokens=True)[0].lower().strip())
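
The usage snippet keeps only the first 30 seconds of audio. For longer recordings, one possible approach is to split the waveform into consecutive 30-second chunks and transcribe each one in turn. The helper below is a hypothetical sketch and not part of this model's code. It cuts at fixed boundaries, which can split a word in two, so voice-activity-based segmentation may give better results.

```python
import numpy as np

SAMPLE_RATE = 16000   # matches sr=16000 in the usage snippet
CHUNK_SECONDS = 30    # matches the 30 s truncation in the usage snippet


def split_into_chunks(audio, sample_rate=SAMPLE_RATE, chunk_seconds=CHUNK_SECONDS):
    """Split a 1-D waveform into consecutive chunks of at most chunk_seconds."""
    chunk_len = sample_rate * chunk_seconds
    return [audio[start:start + chunk_len] for start in range(0, len(audio), chunk_len)]


# Example with 70 s of silence: two full chunks plus a 10 s remainder.
chunks = split_into_chunks(np.zeros(70 * SAMPLE_RATE, dtype=np.float32))
print([len(c) / SAMPLE_RATE for c in chunks])

# Each chunk could then go through the same steps as in the usage snippet
# (with p and m as defined there), and the pieces joined with spaces:
#   feat = p(chunk, sampling_rate=SAMPLE_RATE, return_tensors='pt').input_features
#   ids = m.generate(feat, language='kazakh', task='transcribe', num_beams=1, max_new_tokens=225)
```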

Model size: 0.8B params, F32 tensors (Safetensors)