---
license: apache-2.0
tags:
  - generated_from_trainer
datasets:
  - mozilla-foundation/common_voice_11_0
metrics:
  - wer
model-index:
  - name: whisper_small-fa_v02
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: mozilla-foundation/common_voice_11_0 fa
          type: mozilla-foundation/common_voice_11_0
          config: fa
          split: test
        metrics:
          - name: Wer
            type: wer
            value: 30.9315
language:
  - fa
---

# whisper_small-fa_v02

This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the mozilla-foundation/common_voice_11_0 fa dataset. Data augmentation was also applied using the audiomentations library (see the illustrative sketch after the results below). It achieves the following results on the evaluation set:

- Loss: 0.2291
- WER: 30.3423
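
The exact augmentation pipeline is not documented in this card. As a point of reference, the sketch below shows how audiomentations is commonly applied to raw waveforms before feature extraction; every transform and parameter value in it is an illustrative assumption, not the configuration used for this model.

```python
# Illustrative waveform augmentation with audiomentations.
# All transforms and parameters here are assumptions, not the
# actual training configuration.
from audiomentations import AddGaussianNoise, Compose, PitchShift, TimeStretch

augment = Compose([
    AddGaussianNoise(min_amplitude=0.001, max_amplitude=0.015, p=0.5),
    TimeStretch(min_rate=0.8, max_rate=1.25, p=0.5),
    PitchShift(min_semitones=-4, max_semitones=4, p=0.5),
])

def augment_example(example):
    # Common Voice clips decode to float32 numpy arrays; Whisper
    # expects 16 kHz audio.
    example["audio"]["array"] = augment(
        samples=example["audio"]["array"], sample_rate=16_000
    )
    return example
```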

## Model description

More information needed

## Intended uses & limitations

More information needed
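
That said, since the card describes a Persian speech-recognition checkpoint, a minimal loading sketch via the standard transformers pipeline is shown below. The repository id is assumed from the card title, and `sample.wav` is a placeholder path to a 16 kHz Persian audio file.

```python
# Minimal transcription sketch. The model id is assumed from the card
# title; "sample.wav" is a placeholder for a Persian audio file.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="mohammadh128/whisper_small-fa_v02",
)

print(asr("sample.wav")["text"])
```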

## Training and evaluation data

More information needed

## Training procedure

You can find the notebooks here.

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 5000
- mixed_precision_training: Native AMP
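
Reconstructed as transformers `Seq2SeqTrainingArguments`, the values above map roughly to the sketch below. This is an assumption-based reconstruction, not the actual training script; `output_dir` is a placeholder, and the 500-step evaluation interval is inferred from the results table.

```python
# Rough reconstruction of the reported hyperparameters; not the
# actual training script.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper_small-fa_v02",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=5000,
    fp16=True,  # "Native AMP" mixed-precision training
    evaluation_strategy="steps",
    eval_steps=500,  # inferred from the 500-step cadence in the results table
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are transformers defaults.
)
```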

### Training results

| Step | Training Loss | Validation Loss | WER (%)  |
|:----:|:-------------:|:---------------:|:--------:|
| 500  | 1.770700      | 0.476709        | 52.29181 |
| 1000 | 0.762300      | 0.368512        | 41.83410 |
| 1500 | 0.645000      | 0.323680        | 37.57881 |
| 2000 | 0.601900      | 0.297370        | 36.43209 |
| 2500 | 0.529700      | 0.276422        | 33.52608 |
| 3000 | 0.523200      | 0.260825        | 31.94485 |
| 3500 | 0.488400      | 0.249957        | 33.11771 |
| 4000 | 0.464800      | 0.241462        | 30.34238 |
| 4500 | 0.440500      | 0.233215        | 31.04969 |
| 5000 | 0.440500      | 0.229116        | 30.73605 |
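
WER figures like those above are conventionally computed with the Hugging Face `evaluate` library (backed by jiwer). The sketch below is an assumption about the evaluation code, which is not included in this card; the prediction and reference lists are placeholders.

```python
# Illustrative WER computation; an assumption, not the card's actual
# evaluation code.
import evaluate

wer_metric = evaluate.load("wer")

predictions = ["placeholder model transcription"]    # placeholder
references = ["placeholder ground-truth transcript"]  # placeholder

# `evaluate` returns WER as a fraction; the card reports percentages.
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")
```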

### Framework versions

- Transformers 4.26.0
- PyTorch 2.0.1+cu117
- Datasets 2.8.0
- Tokenizers 0.13.3