Whisper Small - Igbo
Part of Olu Igbo ("Igbo Voice"), an offline, on-device Igbo speech recognition project built for the Arm Create: AI Optimization Challenge 2026. See the full project on GitHub.
A LoRA fine-tune of openai/whisper-small for Igbo automatic speech recognition. Igbo isn't one of Whisper's 99 native languages, so this model uses the <|yo|> (Yoruba) language token as a proxy during both training and inference.
Results
62.45% WER on the FLEURS Igbo test set (969 samples), down from a 68.95% baseline. The figure is verified on the full test set, not estimated from training metrics.
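For readers unfamiliar with the metric, here is a minimal sketch of word error rate as word-level edit distance divided by the reference length, followed by the arithmetic on the two figures above. The function name and the toy strings are illustrative assumptions; the source does not say which evaluation script, text normalization, or aggregation over utterances produced its numbers.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1] / len(ref)


# Toy example (placeholder tokens, not real Igbo text): one substitution in four words.
toy_wer = word_error_rate("a b c d", "a x c d")

# Improvement implied by the reported figures.
baseline_wer, finetuned_wer = 68.95, 62.45
absolute_gain = baseline_wer - finetuned_wer         # percentage points
relative_gain = absolute_gain / baseline_wer * 100   # percent of the baseline
print(f"toy WER: {toy_wer:.2f}")
print(f"absolute: {absolute_gain:.2f} points, relative: {relative_gain:.2f}%")
```

On the reported numbers this works out to a 6.50-point absolute reduction, or roughly 9.4% relative to the baseline.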
Training data
- FLEURS Igbo (ig_ng config): 2,839 training examples
- Nigerian Common Voice Dataset (Igbo subset): 4,571 training examples
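As a quick check on the data mix, the sketch below totals the two sources and computes each one's share. It assumes the sets were simply concatenated with no resampling or filtering, which the source does not state; the dictionary keys are labels chosen for illustration.

```python
# Example counts as listed above; keys are descriptive labels, not dataset IDs.
train_counts = {
    "FLEURS Igbo (ig_ng)": 2839,
    "Nigerian Common Voice (Igbo subset)": 4571,
}

total_examples = sum(train_counts.values())
shares = {name: count / total_examples for name, count in train_counts.items()}

for name, count in train_counts.items():
    print(f"{name}: {count} examples ({shares[name]:.1%})")
print(f"total: {total_examples} examples")
```

Under that assumption the combined set has 7,410 examples, about 38% from FLEURS and 62% from the Common Voice subset.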
Usage
from transformers import WhisperProcessor, WhisperForConditionalGeneration
from peft import PeftModel
import torch
processor = WhisperProcessor.from_pretrained("openai/whisper-small")
base_model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
model = PeftModel.from_pretrained(base_model, "theelvace/whisper-small-igbo")
model = model.merge_and_unload()
# Igbo language token proxy (<|yo|> = 50325); see note above
forced_decoder_ids = [[1, 50325], [2, 50359], [3, 50363]]
# audio_array: 1-D float array of mono audio sampled at 16 kHz (loading is not shown here)
inputs = processor.feature_extractor(audio_array, sampling_rate=16000, return_tensors="pt")
generated_ids = model.generate(inputs.input_features, forced_decoder_ids=forced_decoder_ids)
transcription = processor.tokenizer.decode(generated_ids[0], skip_special_tokens=True)
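The three hard-coded pairs in forced_decoder_ids are easier to audit with names attached. The sketch below rebuilds the same list from (token, ID) pairs. The IDs are the ones in the snippet above; the token names are my reading of the multilingual Whisper vocabulary used by whisper-small, and build_forced_decoder_ids is a hypothetical helper, not part of any library.

```python
# (token name, token ID) pairs in decoding order.
# IDs come from the usage snippet; names are an assumption worth verifying.
PROMPT_TOKENS = [
    ("<|yo|>", 50325),            # Yoruba language token, used as the Igbo proxy
    ("<|transcribe|>", 50359),    # task token: transcribe rather than translate
    ("<|notimestamps|>", 50363),  # disable timestamp prediction
]


def build_forced_decoder_ids(prompt_tokens):
    """Return [[position, token_id], ...] with positions starting at 1.

    Position 0 is left to the start-of-transcript token that the decoder
    begins with, so the forced tokens occupy positions 1, 2, 3.
    """
    return [[pos, tok_id] for pos, (_, tok_id) in enumerate(prompt_tokens, start=1)]


forced_decoder_ids = build_forced_decoder_ids(PROMPT_TOKENS)
print(forced_decoder_ids)
```

To confirm the names against the actual vocabulary, processor.tokenizer.convert_tokens_to_ids("<|yo|>") should return the first ID when run with the processor loaded above.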
On-device deployment
This repo also hosts ONNX exports (encoder, cross-attention initializer, KV-cache decoder) used to run this model fully on-device on Android, with no cloud inference. See the Olu Igbo GitHub repo for the full mobile app, export scripts, and benchmarks on a Snapdragon 678 device.
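One plausible way the three exports fit together is sketched below: the encoder and the cross-attention initializer each run once per clip, and the KV-cache decoder runs once per generated token. This is a generic greedy-decoding sketch, not the project's code. The callables encode, init_cross_kv, and decode_step are hypothetical stand-ins for the three ONNX sessions, whose real input and output names are not described here, and the max_new_tokens default is arbitrary.

```python
from typing import Any, Callable, List, Sequence, Tuple


def greedy_decode(
    encode: Callable[[Any], Any],
    init_cross_kv: Callable[[Any], Any],
    decode_step: Callable[[int, Any, Any], Tuple[Sequence[float], Any]],
    features: Any,
    prompt: List[int],
    eos_id: int,
    max_new_tokens: int = 32,
) -> List[int]:
    """Greedy decoding over a three-stage encoder/initializer/decoder split."""
    if not prompt:
        raise ValueError("prompt must contain at least one token")

    encoder_out = encode(features)         # stage 1: once per audio clip
    cross_kv = init_cross_kv(encoder_out)  # stage 2: once per audio clip

    # Stage 3: feed prompt tokens one at a time to fill the self-attention cache.
    self_kv = None
    logits: Sequence[float] = []
    for token in prompt:
        logits, self_kv = decode_step(token, cross_kv, self_kv)

    generated: List[int] = []
    for _ in range(max_new_tokens):
        next_token = max(range(len(logits)), key=lambda i: logits[i])  # argmax
        if next_token == eos_id:
            break
        generated.append(next_token)
        logits, self_kv = decode_step(next_token, cross_kv, self_kv)
    return generated


# Toy stand-ins so the loop can be exercised without any model files.
def toy_encode(features):
    return features


def toy_init_cross_kv(encoder_out):
    return encoder_out


def toy_decode_step(token, cross_kv, self_kv):
    cache = (self_kv or []) + [token]  # the "cache" just records the history
    logits = [0.0] * 5
    logits[(token + 1) % 5] = 1.0      # always predict the next integer
    return logits, cache


toy_tokens = greedy_decode(
    toy_encode, toy_init_cross_kv, toy_decode_step,
    features=[0.0], prompt=[0], eos_id=4,
)
print(toy_tokens)
```

The point of the split is that only the per-token step repeats, so the expensive encoder pass and the cross-attention key/value projection are paid once per utterance.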
Limitations
- The 62.45% WER reflects real, measured performance, not a polished demo number. Short, clear utterances transcribe more reliably than long or complex ones.
- Accuracy on live microphone audio in real-world noise conditions will generally be worse (higher WER) than the FLEURS test set figure, which is measured on clean studio recordings.
License
MIT
Model tree for theelvace/whisper-small-igbo
Base model
openai/whisper-small
Datasets used to train theelvace/whisper-small-igbo
benjaminogbonna/nigerian_common_voice_dataset
Evaluation results
- WER on FLEURS (ig_ng), self-reported: 62.45