---
license: apache-2.0
tags:
  - generated_from_trainer
datasets:
  - mozilla-foundation/common_voice_11_0
metrics:
  - wer
model-index:
  - name: whisper_small-fa_v02
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: mozilla-foundation/common_voice_11_0 fa
          type: mozilla-foundation/common_voice_11_0
          config: fa
          split: test
        metrics:
          - name: Wer
            type: wer
            value: 30.9315
language:
  - fa
---

# whisper_small-fa_v02

This model is a fine-tuned version of [openai/whisper-small](https://huggingface.co/openai/whisper-small) on the mozilla-foundation/common_voice_11_0 fa dataset. Data augmentation was also applied using the audiomentations library (see the illustrative sketch after the results below). It achieves the following results on the evaluation set:

- Loss: 0.2291
- WER: 30.3423
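
The exact augmentation pipeline is not documented in this card. As a point of reference, the sketch below shows how audiomentations is commonly applied to raw waveforms before feature extraction; every transform and parameter value in it is an illustrative assumption, not the configuration used for this model.

```python
# Illustrative waveform augmentation with audiomentations.
# All transforms and parameters here are assumptions, not the
# actual training configuration.
from audiomentations import AddGaussianNoise, Compose, PitchShift, TimeStretch

augment = Compose([
    AddGaussianNoise(min_amplitude=0.001, max_amplitude=0.015, p=0.5),
    TimeStretch(min_rate=0.8, max_rate=1.25, p=0.5),
    PitchShift(min_semitones=-4, max_semitones=4, p=0.5),
])

def augment_example(example):
    # Common Voice clips decode to float32 numpy arrays; Whisper
    # expects 16 kHz audio.
    example["audio"]["array"] = augment(
        samples=example["audio"]["array"], sample_rate=16_000
    )
    return example
```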

## Model description

More information needed

## Intended uses & limitations

More information needed
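
That said, since the card describes a Persian speech-recognition checkpoint, a minimal loading sketch via the standard transformers pipeline is shown below. The repository id is assumed from the card title, and `sample.wav` is a placeholder path to a 16 kHz Persian audio file.

```python
# Minimal transcription sketch. The model id is assumed from the card
# title; "sample.wav" is a placeholder for a Persian audio file.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="mohammadh128/whisper_small-fa_v02",
)

print(asr("sample.wav")["text"])
```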

## Training and evaluation data

More information needed

## Training procedure

You can find the notebooks here.

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- training_steps: 5000
- mixed_precision_training: Native AMP
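
Reconstructed as transformers `Seq2SeqTrainingArguments`, the values above map roughly to the sketch below. This is an assumption-based reconstruction, not the actual training script; `output_dir` is a placeholder, and the 500-step evaluation interval is inferred from the results table.

```python
# Rough reconstruction of the reported hyperparameters; not the
# actual training script.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper_small-fa_v02",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=5000,
    fp16=True,  # "Native AMP" mixed-precision training
    evaluation_strategy="steps",
    eval_steps=500,  # inferred from the 500-step cadence in the results table
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are transformers defaults.
)
```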

### Training results

| Step | Training Loss | Validation Loss | WER (%)  |
|:----:|:-------------:|:---------------:|:--------:|
| 500  | 1.770700      | 0.476709        | 52.29181 |
| 1000 | 0.762300      | 0.368512        | 41.83410 |
| 1500 | 0.645000      | 0.323680        | 37.57881 |
| 2000 | 0.601900      | 0.297370        | 36.43209 |
| 2500 | 0.529700      | 0.276422        | 33.52608 |
| 3000 | 0.523200      | 0.260825        | 31.94485 |
| 3500 | 0.488400      | 0.249957        | 33.11771 |
| 4000 | 0.464800      | 0.241462        | 30.34238 |
| 4500 | 0.440500      | 0.233215        | 31.04969 |
| 5000 | 0.440500      | 0.229116        | 30.73605 |
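
WER figures like those above are conventionally computed with the Hugging Face `evaluate` library (backed by jiwer). The sketch below is an assumption about the evaluation code, which is not included in this card; the prediction and reference lists are placeholders.

```python
# Illustrative WER computation; an assumption, not the card's actual
# evaluation code.
import evaluate

wer_metric = evaluate.load("wer")

predictions = ["placeholder model transcription"]    # placeholder
references = ["placeholder ground-truth transcript"]  # placeholder

# `evaluate` returns WER as a fraction; the card reports percentages.
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")
```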

### Framework versions

- Transformers 4.26.0
- PyTorch 2.0.1+cu117
- Datasets 2.8.0
- Tokenizers 0.13.3