---
license: apache-2.0
datasets:
- openslr/openslr
- google/fleurs
- PhanithLIM/rfi-news-dataset
- seanghay/km-speech-corpus
language:
- km
metrics:
- wer
base_model:
- openai/whisper-small
pipeline_tag: automatic-speech-recognition
widget:
  - src: output/1.wav
    example_title: Audio 1
    output:
      text: "ក្នុងរាត្រីកាលដ៏ស្ងប់ស្ងាត់មួយ បានផ្តិតជាប់នៅរូបភាពដ៏សែនសោកសង្រែងជាខ្លាំងចំពោះបុរសចំទង់ម៉ុនាស់"
  - src: output/2.wav
    example_title: Audio 2
    output:
      text: "ពុក កុំជាទៅដល់ហើយ!សុំទេវិត្តអាចពុកកុំអោយកើតឯងមុនពេលខ្ញុំទៅដល់!"
---

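This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) for Khmer (`km`) automatic speech recognition. The auto-generated card recorded the training dataset as `None`; the card metadata lists openslr/openslr, google/fleurs, PhanithLIM/rfi-news-dataset, and seanghay/km-speech-corpus.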
It achieves the following results on the evaluation set:
- eval_loss: 0.18
- eval_wer: 65.4881% (0.654881 as a fraction)
- eval_runtime: 2738.0001 seconds
- eval_samples_per_second: 1.588
- eval_steps_per_second: 0.199
- epoch: 4.0
- step: 4345
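
To make the `eval_wer` figure concrete, here is a minimal sketch of how word error rate is commonly computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace-based word splitting are assumptions for illustration; the card does not say which implementation or text normalization produced the reported number.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumption: words are separated by whitespace. The segmentation used
    # for the reported score is not stated in the card.
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"{wer:.4f} as a fraction, {100 * wer:.2f} as a percentage")
```

Read this way, the two values reported for `eval_wer` are the same quantity expressed as a percentage and as a fraction.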

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_steps: 1000
- num_epochs: 10
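
The list above can be written as keyword arguments for the Transformers `Seq2SeqTrainingArguments` class. This is a sketch, not the original training script: the helper `build_training_arguments` and its `output_dir` parameter are hypothetical, and mapping `train_batch_size` to `per_device_train_batch_size` assumes training ran on a single device.

```python
# Values copied from the hyperparameter list; keys use the field names of
# the Hugging Face `TrainingArguments` class.
training_kwargs = {
    "learning_rate": 1e-5,
    # Assumption: single device, so the listed batch sizes are per-device sizes.
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "warmup_steps": 1000,
    "num_train_epochs": 10,
}


def build_training_arguments(output_dir: str):
    """Hypothetical helper: build the arguments object (needs `transformers`)."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **training_kwargs)
```

Keeping the values in a plain dictionary lets them be inspected or logged without importing the library; the import only happens when the helper is called.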

### Framework versions

- Transformers 4.45.2
- Pytorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3