metadata
license: apache-2.0
datasets:
- openslr/openslr
- google/fleurs
- PhanithLIM/rfi-news-dataset
- seanghay/km-speech-corpus
language:
- km
metrics:
- wer
base_model:
- openai/whisper-small
pipeline_tag: automatic-speech-recognition
widget:
- src: output/1.wav
  example_title: Audio 1
  output:
    text: >-
      ក្នុងរាត្រីកាលដ៏ស្ងប់ស្ងាត់មួយ
      បានផ្តិតជាប់នៅរូបភាពដ៏សែនសោកសង្រែងជាខ្លាំងចំពោះបុរសចំទង់ម៉ុនាស់
- src: output/2.wav
  example_title: Audio 2
  output:
    text: ពុក កុំជាទៅដល់ហើយ!សុំទេវិត្តអាចពុកកុំអោយកើតឯងមុនពេលខ្ញុំទៅដល់!
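This model is a fine-tuned version of openai/whisper-small for Khmer (km) automatic speech recognition, trained on the datasets listed in the metadata: openslr/openslr, google/fleurs, PhanithLIM/rfi-news-dataset, and seanghay/km-speech-corpus. It achieves the following results on the evaluation set: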
- eval_loss: 0.18
- eval_wer: 65.4881% (0.654881 as a fraction)
- eval_runtime: 2738.0001
- eval_samples_per_second: 1.588
- eval_steps_per_second: 0.199
- epoch: 4.0
- step: 4345
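For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between the reference and the hypothesis, divided by the number of reference words. The following is a minimal sketch of that definition, not the evaluation script used for this model. It assumes words are separated by whitespace; the card does not state how the Khmer text was segmented before scoring, and Khmer is normally written without spaces between words, so the reported value depends on that choice. The function name `word_error_rate` is illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    # Assumption: whitespace tokenization. Real Khmer text would need a
    # word segmenter first; the card does not say which one was used.
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the"): 2 / 6 words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"{wer:.4f} as a fraction, {100 * wer:.2f} as a percentage")
# prints: 0.3333 as a fraction, 33.33 as a percentage
```

The two forms printed at the end correspond to the two ways eval_wer is written above: 65.4881 as a percentage and 0.654881 as a fraction.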
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_steps: 1000
- num_epochs: 10
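The list pairs `lr_scheduler_type: constant` with 1000 warmup steps. In Transformers, the `constant` schedule holds the learning rate fixed from the first step, while `constant_with_warmup` ramps it linearly from zero over the warmup steps. If that applies to this run, the warmup setting would have had no effect. The sketch below reimplements both schedules in plain Python to show the difference; the function names are illustrative and this is not the training code used for this model.

```python
LEARNING_RATE = 1e-05  # learning_rate from the list above
WARMUP_STEPS = 1000    # lr_scheduler_warmup_steps from the list above


def lr_constant(step: int, base_lr: float = LEARNING_RATE) -> float:
    """Constant schedule: the same learning rate at every step."""
    return base_lr


def lr_constant_with_warmup(
    step: int,
    base_lr: float = LEARNING_RATE,
    warmup_steps: int = WARMUP_STEPS,
) -> float:
    """Linear ramp from 0 to base_lr over warmup_steps, then constant."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr


for step in (0, 500, 1000, 4345):
    print(step, lr_constant(step), lr_constant_with_warmup(step))
```

With `constant`, step 0 already trains at 1e-05; with `constant_with_warmup`, the rate reaches 1e-05 only at step 1000.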
Framework versions
- Transformers 4.45.2
- Pytorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3
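When reproducing results, it can help to compare the local environment against the versions listed above. The following is a small sketch using only the standard library. The pip distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumptions, since the card gives only display names, and the helper `find_mismatches` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from this card; distribution names are assumed.
PINNED = {
    "transformers": "4.45.2",
    "torch": "2.5.1+cu121",
    "datasets": "3.1.0",
    "tokenizers": "0.20.3",
}


def find_mismatches(pinned, lookup=version):
    """Return {name: (listed, installed)} for packages that differ.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, listed in pinned.items():
        try:
            installed = lookup(name)
        except PackageNotFoundError:
            installed = None
        if installed != listed:
            mismatches[name] = (listed, installed)
    return mismatches


for name, (listed, installed) in find_mismatches(PINNED).items():
    print(f"{name}: card lists {listed}, installed {installed}")
```

An exact string comparison is deliberately strict: a CPU-only or different CUDA build of PyTorch reports a different local version suffix and is flagged, even though it may behave the same for inference.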