metadata
library_name: transformers
license: apache-2.0
base_model: openai/whisper-base
tags:
  - generated_from_trainer
metrics:
  - wer
datasets:
  - PhanithLIM/ams-speech-dataset
  - openslr/openslr
  - google/fleurs
  - PhanithLIM/kh-wmc
  - PhanithLIM/wmc-international-news
  - PhanithLIM/rfi-news-dataset
  - PhanithLIM/aakanee-kh
  - rinabuoy/khm-asr-open
  - seanghay/khmer_grkpp_speech
  - seanghay/khmer_mpwt_speech
  - seanghay/km-speech-corpus
model-index:
  - name: Khmer Whisper Small PhanithLIM
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Google FLEURS
          type: google/fleurs
          config: km_kh
          split: test
        metrics:
          - name: CER
            type: cer
            value: 14.414

whisper-base-aug-20-april-lightning-v1

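This model is a fine-tuned version of openai/whisper-base. The auto-generated card did not record a training dataset name (it was listed as None); the datasets associated with the model are listed in the metadata above. It achieves the following results on the evaluation set: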

  • Loss: 0.1044
  • WER: 85.2539
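
The card reports a word error rate (WER) here and a character error rate (CER) in the metadata. As a rough illustration of how the two metrics differ, the sketch below computes both from a plain Levenshtein distance. It assumes both metrics are expressed as percentages and that words are split on whitespace; the helper names and the toy English strings are illustrative and are not taken from the training code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, splitting words on whitespace (assumption)."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate in percent, counting every character including spaces."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Toy strings: the same single-character mistake, with and without word spacing.
print(round(wer("the cat sat", "the cat sit"), 2))  # 33.33
print(round(cer("the cat sat", "the cat sit"), 2))  # 9.09
print(round(wer("thecatsat", "thecatsit"), 2))      # 100.0
print(round(cer("thecatsat", "thecatsit"), 2))      # 11.11
```

When text has no spaces, a single wrong character makes the whole whitespace-delimited "word" wrong, so WER rises sharply while CER barely changes. Khmer is written without spaces between words, so a whitespace-based WER can be much harsher than CER. The card does not say whether this accounts for the gap between its two figures.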

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 10
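
The relationship between the batch-size settings above can be checked with a little arithmetic. The sketch below assumes training ran on a single device (the card does not state the device count, but 16 × 2 = 32 is consistent with one device) and takes the 1424 optimizer steps per epoch from the first row of the training results. The per-epoch example count is only an approximation, since the last batch of an epoch may be partial.

```python
# Hyperparameters copied from the list above.
hyperparameters = {
    "learning_rate": 1e-05,
    "train_batch_size": 16,
    "eval_batch_size": 16,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "lr_scheduler_type": "constant",
    "lr_scheduler_warmup_steps": 1000,
    "num_epochs": 10,
}

num_devices = 1          # assumption: not stated in the card
steps_per_epoch = 1424   # optimizer steps at epoch 1.0 in the training results

# Effective batch size = per-device batch size x accumulation steps x devices.
effective_batch_size = (
    hyperparameters["train_batch_size"]
    * hyperparameters["gradient_accumulation_steps"]
    * num_devices
)

# Rough upper bound on training examples seen per epoch.
approx_examples_per_epoch = steps_per_epoch * effective_batch_size

print(effective_batch_size)        # 32
print(approx_examples_per_epoch)   # 45568
```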

Training results

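| Training Loss | Epoch  | Step  | Validation Loss | WER     |
|---------------|--------|-------|-----------------|---------|
| 0.5768        | 1.0    | 1424  | 0.2062          | 98.4412 |
| 0.1775        | 2.0    | 2848  | 0.1505          | 89.7549 |
| 0.1321        | 3.0    | 4272  | 0.1304          | 86.5233 |
| 0.109         | 4.0    | 5696  | 0.1184          | 87.7905 |
| 0.0935        | 5.0    | 7120  | 0.1108          | 83.8661 |
| 0.0815        | 6.0    | 8544  | 0.1072          | 85.3635 |
| 0.0722        | 7.0    | 9968  | 0.1058          | 84.4405 |
| 0.0644        | 8.0    | 11392 | 0.1049          | 82.3862 |
| 0.0575        | 9.0    | 12816 | 0.1049          | 84.2761 |
| 0.0521        | 9.9933 | 14230 | 0.1044          | 85.2539 |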
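
In these results, the checkpoint with the lowest validation loss is not the one with the lowest WER. The sketch below re-enters the rows and picks the best row under each criterion. The `EvalRow` type and its field names are introduced here for illustration only and are not part of the training code.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    training_loss: float
    epoch: float
    step: int
    validation_loss: float
    wer: float


# Rows copied from the training results above.
rows = [
    EvalRow(0.5768, 1.0, 1424, 0.2062, 98.4412),
    EvalRow(0.1775, 2.0, 2848, 0.1505, 89.7549),
    EvalRow(0.1321, 3.0, 4272, 0.1304, 86.5233),
    EvalRow(0.109, 4.0, 5696, 0.1184, 87.7905),
    EvalRow(0.0935, 5.0, 7120, 0.1108, 83.8661),
    EvalRow(0.0815, 6.0, 8544, 0.1072, 85.3635),
    EvalRow(0.0722, 7.0, 9968, 0.1058, 84.4405),
    EvalRow(0.0644, 8.0, 11392, 0.1049, 82.3862),
    EvalRow(0.0575, 9.0, 12816, 0.1049, 84.2761),
    EvalRow(0.0521, 9.9933, 14230, 0.1044, 85.2539),
]

best_by_loss = min(rows, key=lambda r: r.validation_loss)
best_by_wer = min(rows, key=lambda r: r.wer)

print(best_by_loss.step, best_by_loss.validation_loss)  # 14230 0.1044
print(best_by_wer.step, best_by_wer.wer)                # 11392 82.3862
```

The headline numbers at the top of the card match the final row (the lowest validation loss), while the lowest WER was recorded at step 11392. The card does not say which checkpoint was published.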

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.2.1+cu121
  • Datasets 3.5.0
  • Tokenizers 0.21.1