whisper-small-dholuo
This model is a fine-tuned version of openai/whisper-small on the Anv-ke/Dholuo dataset. It achieves the following results on the evaluation set:
- Loss: 0.7701
- Wer: 44.3860
- Cer: 13.3355
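For readers unfamiliar with these metrics, the following is a minimal sketch of how word error rate (WER) and character error rate (CER) are commonly defined: edit distance between reference and hypothesis, divided by the reference length. It assumes the values above are percentages. The card does not show the actual metric code, so the function names and the placeholder English strings here are illustrative only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate as a percentage of reference words."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate as a percentage of reference characters."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Placeholder strings (not Dholuo, not taken from the dataset).
example_wer = wer("the cat sat down", "the cat sat dawn")
example_cer = cer("the cat sat down", "the cat sat dawn")
```

One wrong word out of four gives a high WER even though only one character differs, which is one reason the CER on this card is much lower than the WER.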
Model description
Whisper-small-dholuo is a speech-to-text model for Dholuo. It was created by fine-tuning OpenAI's whisper-small model.
Intended uses & limitations
The intended use case is speech-to-text (STT) for the Dholuo language. The main limitation is that the model has not fully captured Dholuo orthography, so some words are transcribed incorrectly.
Training and evaluation data
Both the training and the evaluation data come from the train split of the Anv-ke/Dholuo dataset. The split was made with a loop over the data that takes 50 audio clips for training and sets aside the next 50 for evaluation.
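The alternating split described above might look roughly like the sketch below. It assumes the 50/50 pattern repeats over the whole stream, which the card does not state explicitly; `split_by_blocks` and `BLOCK` are hypothetical names, not the author's code.

```python
BLOCK = 50  # block size taken from the description above


def split_by_blocks(examples, block=BLOCK):
    """Send alternating blocks of `block` examples to train and evaluation."""
    train, evaluation = [], []
    for index, example in enumerate(examples):
        # Even-numbered blocks (0-49, 100-149, ...) go to training,
        # odd-numbered blocks (50-99, 150-199, ...) go to evaluation.
        if (index // block) % 2 == 0:
            train.append(example)
        else:
            evaluation.append(example)
    return train, evaluation


# Stand-in data: integers instead of audio examples.
train_set, eval_set = split_by_blocks(range(200))
```

Because the function only iterates once over its input, the same logic would also work on a streamed dataset, which is an iterable rather than an indexable list.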
Training procedure
To train the model, I imported the necessary libraries, including torch, evaluate, and the Wav2Vec2 processor. I then loaded the Anv-ke/Dholuo dataset in streaming mode, loaded the whisper-small model, and prepared a data preprocessing function and a data collator. Finally, I initialized the training arguments listed in the Training hyperparameters section.
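The card does not show the collator itself. As a hedged illustration of one thing such a collator commonly does in sequence-to-sequence speech fine-tuning, the sketch below pads label sequences to a common length with -100, the default `ignore_index` of PyTorch's cross-entropy loss, so padded positions do not contribute to the loss. The helper name `pad_labels` and the token IDs are hypothetical.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def pad_labels(label_sequences, ignore_index=IGNORE_INDEX):
    """Right-pad every label sequence to the longest one in the batch."""
    max_len = max(len(seq) for seq in label_sequences)
    return [
        list(seq) + [ignore_index] * (max_len - len(seq))
        for seq in label_sequences
    ]


# Hypothetical token IDs for a batch of two transcripts.
padded = pad_labels([[11, 12, 13], [21]])
```

A real collator would also batch the audio features and convert everything to tensors; this sketch covers only the label-masking step.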
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 16
- optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 50
- training_steps: 1200
- mixed_precision_training: Native AMP
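To make the relationship between these values concrete, here is a small sketch of the arithmetic. It assumes a single device (so that 1 × 16 gives the listed total batch size of 16) and the usual shape of a linear schedule with warmup: a linear ramp from 0 to the peak rate, then a linear decay to 0 at the final step. The function name is illustrative, not part of any library.

```python
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 16
PEAK_LR = 5e-06
WARMUP_STEPS = 50
TRAINING_STEPS = 1200

# Examples seen per optimizer step (single device assumed).
effective_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS


def linear_schedule_lr(step):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = max(0, TRAINING_STEPS - step)
    return PEAK_LR * remaining / (TRAINING_STEPS - WARMUP_STEPS)


# Learning rate at the start, at the end of warmup, and at the last step.
lr_start = linear_schedule_lr(0)
lr_peak = linear_schedule_lr(WARMUP_STEPS)
lr_end = linear_schedule_lr(TRAINING_STEPS)
```

With a batch size of 1, gradient accumulation is what brings the effective batch up to 16, at the cost of 16 forward and backward passes per optimizer step.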
Training results
| Training Loss | Epoch | Step | Validation Loss | Wer | Cer |
|---|---|---|---|---|---|
| 1.7654 | 0.0833 | 100 | 1.6712 | 59.8246 | 16.3678 |
| 1.7239 | 0.1667 | 200 | 1.5042 | 56.6667 | 19.1718 |
| 1.7802 | 0.25 | 300 | 1.2936 | 50.7018 | 16.1069 |
| 0.855 | 0.3333 | 400 | 0.9032 | 47.1930 | 14.6397 |
| 0.6151 | 0.4167 | 500 | 0.8899 | 46.3158 | 14.1506 |
| 0.7435 | 0.5 | 600 | 0.8364 | 44.9123 | 13.2051 |
| 0.9137 | 0.5833 | 700 | 0.8783 | 45.6140 | 14.4441 |
| 0.8981 | 0.6667 | 800 | 0.8198 | 44.7368 | 12.8790 |
| 0.8801 | 0.75 | 900 | 0.7919 | 44.3860 | 13.0095 |
| 0.7569 | 0.8333 | 1000 | 0.7833 | 91.5789 | 48.7773 |
| 0.7046 | 0.9167 | 1100 | 0.7743 | 44.2105 | 13.3029 |
| 1.1001 | 1.0 | 1200 | 0.7701 | 44.3860 | 13.3355 |
Framework versions
- Transformers 4.57.1
- Pytorch 2.8.0+cu126
- Datasets 3.6.0
- Tokenizers 0.22.1