whisper-small-dholuo

This model is a fine-tuned version of openai/whisper-small on the Anv-ke/Dholuo dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7701
  • Wer: 44.3860
  • Cer: 13.3355
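For readers unfamiliar with the two metrics, the sketch below shows how word error rate (WER) and character error rate (CER) are commonly defined: the edit distance between reference and hypothesis, divided by the reference length. It assumes the values above are reported as percentages, which their scale suggests. The function names and the placeholder strings are illustrative only; the card states that the `evaluate` library was imported but does not show the metric code that was actually used.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitution, insertion, deletion)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, computed over whitespace-separated tokens."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate in percent, computed over raw characters."""
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


# Placeholder strings (not real Dholuo data): one wrong word out of four.
print(wer("a b c d", "a x c d"))
print(cer("abcd", "abxd"))
```

Because CER counts characters rather than whole words, a transcription with small spelling deviations can have a high WER and a much lower CER, which is consistent with the gap between the two numbers above.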

Model description

whisper-small-dholuo is a speech-to-text model for the Dholuo language. It was created by fine-tuning OpenAI's whisper-small model.

Intended uses & limitations

The intended use case is speech-to-text (STT) for the Dholuo language. The main limitation is that the model has not fully captured Dholuo orthography, so some words are transcribed incorrectly.
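A minimal usage sketch follows, assuming the `transformers` library is installed and that the model is published under the repository ID `NjeriKahoro/whisper-small-dholuo`. Because access to the repository requires accepting its conditions, the call will likely need an authenticated Hugging Face session. The helper name `transcribe` and the audio path are hypothetical.

```python
MODEL_ID = "NjeriKahoro/whisper-small-dholuo"


def transcribe(audio_path: str) -> str:
    """Transcribe one audio file with the fine-tuned checkpoint.

    transformers is imported lazily so that this module can be imported
    even on machines where the library is not installed.
    """
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    return asr(audio_path)["text"]


# Hypothetical usage; "sample.wav" stands in for a local Dholuo recording.
# print(transcribe("sample.wav"))
```

Given the limitation noted above, transcriptions are best treated as drafts that may need spelling corrections.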

Training and evaluation data

Both the training and evaluation data come from the train split of Anv-ke/Dholuo. The split was made with a loop that walks through the data in blocks of 50 audio clips: one block of 50 is used for training, and the next block of 50 is held out for evaluation.
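One way to read the split described above is as an alternating-block partition. The sketch below illustrates that reading on a plain Python iterable; the function name `alternating_split` and the use of lists are assumptions made for illustration, not the code that was actually run on the streamed dataset.

```python
def alternating_split(samples, block_size=50):
    """Send alternating blocks of `block_size` items to train and evaluation."""
    train, evaluation = [], []
    for index, sample in enumerate(samples):
        # Even-numbered blocks (0, 2, 4, ...) go to training, odd ones to evaluation.
        if (index // block_size) % 2 == 0:
            train.append(sample)
        else:
            evaluation.append(sample)
    return train, evaluation


# Integers stand in for audio examples.
train, evaluation = alternating_split(range(200))
print(len(train), len(evaluation))
```

Under this reading, the training and evaluation sets are roughly equal in size and are interleaved throughout the original ordering of the data.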

Training procedure

To train the model, I imported the necessary libraries, including torch, evaluate, the Wav2Vec2 processor and others. I then loaded the Anv-ke/Dholuo dataset in streaming mode, loaded the whisper-small model, and prepared a data preprocessing function and a data collator. Finally, I initialized the training arguments shown below.
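The collator itself is not shown in this card. As a rough illustration of one job such a collator typically does in sequence-to-sequence fine-tuning, the sketch below pads label sequences in a batch to a common length using -100, the default index that PyTorch's cross-entropy loss ignores. The function name `pad_labels` and the token IDs are hypothetical, and plain lists are used instead of tensors to keep the example self-contained.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def pad_labels(label_batch, pad_value=IGNORE_INDEX):
    """Right-pad every label sequence to the length of the longest one."""
    max_len = max(len(labels) for labels in label_batch)
    return [list(labels) + [pad_value] * (max_len - len(labels))
            for labels in label_batch]


# Hypothetical token IDs for two transcripts of different lengths.
print(pad_labels([[11, 12, 13], [21]]))
```

Masking the padded positions this way keeps them from contributing to the training loss.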

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • training_steps: 1200
  • mixed_precision_training: Native AMP
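To make the interaction between these values concrete, the sketch below computes the effective batch size and the learning rate at a given optimizer step under a linear schedule with warmup. It assumes a single training device and the usual shape of such a schedule (linear ramp from 0 to the peak over the warmup steps, then linear decay to 0 at the final step). The function names are illustrative.

```python
LEARNING_RATE = 5e-06
TRAIN_BATCH_SIZE = 1
GRADIENT_ACCUMULATION_STEPS = 16
WARMUP_STEPS = 50
TRAINING_STEPS = 1200


def effective_batch_size(num_devices: int = 1) -> int:
    """Samples contributing to each optimizer update (single device assumed)."""
    return TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * num_devices


def linear_lr(step: int) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = TRAINING_STEPS - step
    return LEARNING_RATE * max(0.0, remaining / (TRAINING_STEPS - WARMUP_STEPS))


print(effective_batch_size())         # matches total_train_batch_size above
for step in (0, 25, 50, 625, 1200):
    print(step, linear_lr(step))
```

With a batch size of 1 and 16 accumulation steps, each of the 1200 optimizer updates averages gradients over 16 samples, so the run processes 1200 × 16 = 19200 training samples in total.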

Training results

| Training Loss | Epoch  | Step | Validation Loss | Wer     | Cer     |
|---------------|--------|------|-----------------|---------|---------|
| 1.7654        | 0.0833 | 100  | 1.6712          | 59.8246 | 16.3678 |
| 1.7239        | 0.1667 | 200  | 1.5042          | 56.6667 | 19.1718 |
| 1.7802        | 0.25   | 300  | 1.2936          | 50.7018 | 16.1069 |
| 0.855         | 0.3333 | 400  | 0.9032          | 47.1930 | 14.6397 |
| 0.6151        | 0.4167 | 500  | 0.8899          | 46.3158 | 14.1506 |
| 0.7435        | 0.5    | 600  | 0.8364          | 44.9123 | 13.2051 |
| 0.9137        | 0.5833 | 700  | 0.8783          | 45.6140 | 14.4441 |
| 0.8981        | 0.6667 | 800  | 0.8198          | 44.7368 | 12.8790 |
| 0.8801        | 0.75   | 900  | 0.7919          | 44.3860 | 13.0095 |
| 0.7569        | 0.8333 | 1000 | 0.7833          | 91.5789 | 48.7773 |
| 0.7046        | 0.9167 | 1100 | 0.7743          | 44.2105 | 13.3029 |
| 1.1001        | 1.0    | 1200 | 0.7701          | 44.3860 | 13.3355 |

Framework versions

  • Transformers 4.57.1
  • Pytorch 2.8.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.22.1