
Whisper Small GA-EN Speech Translation

This model is a fine-tuned version of openai/whisper-small on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia datasets. The best checkpoint by ChrF (this version) is at step 2000 (epoch 1.31) and achieves the following results on the evaluation set:

  • Loss: 1.1571
  • BLEU: 30.25
  • ChrF: 48.12
  • WER: 64.9707
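
As a rough illustration of how a checkpoint like this might be used, the sketch below wraps the Transformers `pipeline` API in a small helper. The model ID is the one shown on this card's page; the helper name `translate_speech` and the audio path in the usage comment are hypothetical. The card does not say whether extra generation arguments (for example a language or task setting) are needed, so none are passed here.

```python
MODEL_ID = "ymoslem/whisper-small-ga2en-v3.1"


def translate_speech(audio_path, model_id=MODEL_ID):
    """Return the text the model generates for one audio file.

    Hypothetical helper: assumes the `transformers` library and an audio
    backend such as ffmpeg are installed, and that the default generation
    settings stored with the checkpoint are appropriate.
    """
    # Imported lazily so that defining the helper does not load the library.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(audio_path)
    return result["text"]


# Example call (downloads the checkpoint on first use):
#     print(translate_speech("irish_sample.wav"))
```

Building the pipeline inside the helper keeps the example short; for repeated calls it would be cheaper to create the pipeline once and reuse it.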

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 0.03
  • training_steps: 3000
  • mixed_precision_training: Native AMP
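
To make the learning-rate settings above concrete, here is a minimal sketch of a linear schedule with warmup in plain Python. It assumes the listed warmup value of 0.03 is a fraction of the 3000 training steps (90 steps), which the card does not state explicitly, and that the rate then decays linearly to zero. The function name `linear_schedule_lr` is hypothetical.

```python
def linear_schedule_lr(step, base_lr=1e-4, total_steps=3000, warmup_steps=90):
    """Learning rate at a given optimizer step.

    warmup_steps=90 is an assumption: 0.03 x 3000 steps.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr at the end of warmup down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 45, 90, 1545, 3000):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the rate peaks at 0.0001 at step 90 and is half that value at step 1545, midway through the decay phase.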

Training results

| Training Loss | Epoch | Step | BLEU | ChrF | Validation Loss | WER |
|---|---|---|---|---|---|---|
| 2.6685 | 0.07 | 100 | 5.05 | 20.18 | 2.0544 | 139.8919 |
| 2.4028 | 0.13 | 200 | 12.29 | 29.72 | 1.7367 | 95.5425 |
| 2.1231 | 0.2 | 300 | 14.33 | 30.77 | 1.6141 | 101.3958 |
| 1.9192 | 0.26 | 400 | 16.86 | 35.65 | 1.4778 | 91.0851 |
| 1.7129 | 0.33 | 500 | 16.77 | 37.53 | 1.3811 | 93.8766 |
| 1.5398 | 0.39 | 600 | 18.85 | 39.0 | 1.3427 | 90.2296 |
| 1.4257 | 0.46 | 700 | 25.73 | 43.3 | 1.2784 | 70.3287 |
| 1.3044 | 0.53 | 800 | 25.43 | 44.33 | 1.2274 | 72.3548 |
| 1.2626 | 0.59 | 900 | 25.09 | 44.62 | 1.1875 | 72.6249 |
| 1.2801 | 0.66 | 1000 | 25.68 | 45.53 | 1.1571 | 71.0491 |
| 1.2876 | 0.72 | 1100 | 20.62 | 41.49 | 1.2193 | 85.8622 |
| 1.2609 | 0.79 | 1200 | 29.47 | 45.04 | 1.2079 | 65.2859 |
| 1.187 | 0.85 | 1300 | 24.65 | 43.73 | 1.2086 | 72.9851 |
| 1.0342 | 0.92 | 1400 | 30.34 | 47.62 | 1.1766 | 64.3854 |
| 1.0519 | 0.98 | 1500 | 29.39 | 47.69 | 1.1425 | 64.9707 |
| 0.5473 | 1.05 | 1600 | 28.02 | 46.27 | 1.1842 | 67.6722 |
| 0.4886 | 1.12 | 1700 | 26.62 | 46.37 | 1.1845 | 76.4971 |
| 0.4354 | 1.18 | 1800 | 23.63 | 45.16 | 1.1621 | 86.1324 |
| 0.4709 | 1.25 | 1900 | 27.86 | 47.3 | 1.1544 | 73.7506 |
| 0.4802 | 1.31 | 2000 | 30.25 | 48.12 | 1.1571 | 64.9707 |
| 0.4565 | 1.38 | 2100 | 24.75 | 44.7 | 1.2095 | 77.4426 |
| 0.4797 | 1.44 | 2200 | 28.46 | 46.03 | 1.2051 | 67.1769 |
| 0.423 | 1.51 | 2300 | 28.34 | 47.65 | 1.2079 | 68.6177 |
| 0.4254 | 1.58 | 2400 | 27.78 | 46.01 | 1.2251 | 67.8523 |
| 0.4493 | 1.64 | 2500 | 26.61 | 47.8 | 1.1898 | 71.1391 |
| 0.3614 | 1.71 | 2600 | 30.08 | 47.25 | 1.2079 | 64.2954 |
| 0.4052 | 1.77 | 2700 | 30.88 | 47.44 | 1.1975 | 64.2053 |
| 0.3541 | 1.84 | 2800 | 28.4 | 46.02 | 1.2006 | 70.2837 |
| 0.3736 | 1.9 | 2900 | 30.82 | 47.52 | 1.1906 | 64.1153 |
| 0.3326 | 1.97 | 3000 | 27.57 | 46.72 | 1.1870 | 70.6439 |
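
The choice of step 2000 as the best checkpoint by ChrF can be checked directly against the training results. The sketch below copies the step, BLEU, ChrF, and WER columns from the results above and picks the best row under each metric. It only illustrates the selection arithmetic; the card does not describe how checkpoint selection was configured during training, and the names `ROWS` and `parse_rows` are hypothetical.

```python
# Columns copied from the training results: step, BLEU, ChrF, WER.
ROWS = """
100 5.05 20.18 139.8919
200 12.29 29.72 95.5425
300 14.33 30.77 101.3958
400 16.86 35.65 91.0851
500 16.77 37.53 93.8766
600 18.85 39.0 90.2296
700 25.73 43.3 70.3287
800 25.43 44.33 72.3548
900 25.09 44.62 72.6249
1000 25.68 45.53 71.0491
1100 20.62 41.49 85.8622
1200 29.47 45.04 65.2859
1300 24.65 43.73 72.9851
1400 30.34 47.62 64.3854
1500 29.39 47.69 64.9707
1600 28.02 46.27 67.6722
1700 26.62 46.37 76.4971
1800 23.63 45.16 86.1324
1900 27.86 47.3 73.7506
2000 30.25 48.12 64.9707
2100 24.75 44.7 77.4426
2200 28.46 46.03 67.1769
2300 28.34 47.65 68.6177
2400 27.78 46.01 67.8523
2500 26.61 47.8 71.1391
2600 30.08 47.25 64.2954
2700 30.88 47.44 64.2053
2800 28.4 46.02 70.2837
2900 30.82 47.52 64.1153
3000 27.57 46.72 70.6439
"""


def parse_rows(text):
    """Turn the whitespace-separated rows into a list of dicts."""
    records = []
    for line in text.strip().splitlines():
        step, bleu, chrf, wer = line.split()
        records.append(
            {"step": int(step), "bleu": float(bleu),
             "chrf": float(chrf), "wer": float(wer)}
        )
    return records


records = parse_rows(ROWS)
best_by_chrf = max(records, key=lambda r: r["chrf"])  # higher is better
best_by_bleu = max(records, key=lambda r: r["bleu"])  # higher is better
best_by_wer = min(records, key=lambda r: r["wer"])    # lower is better

print("best by ChrF:", best_by_chrf)
print("best by BLEU:", best_by_bleu)
print("best by WER: ", best_by_wer)
```

On these rows ChrF selects step 2000, while BLEU would select step 2700 and WER step 2900, so the reported best checkpoint depends on the metric chosen.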

Framework versions

  • Transformers 4.39.3
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Model details

  • Model ID: ymoslem/whisper-small-ga2en-v3.1
  • Model size: 242M params
  • Tensor type: F32 (Safetensors)

Evaluation results

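  • BLEU on IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia (self-reported): 27.570
  • WER on IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia (self-reported): 70.644

These self-reported values match the final training step (3000) in the training results, not the step-2000 checkpoint reported as best by ChrF.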
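
The WER figures in this card are percentages, and early in training they exceed 100 (for example 139.8919 at step 100). A minimal sketch of word error rate as word-level edit distance divided by the reference length shows how that can happen when the hypothesis contains many extra words. The card does not say which implementation produced its numbers, so `word_error_rate` below is a hypothetical helper, and details such as text normalization may differ.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance over reference length, as a percentage."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# One substitution and one deletion against a six-word reference.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
# Three inserted words against a one-word reference give a WER above 100.
print(word_error_rate("hello", "hello there big world"))
```

Because insertions count as errors but do not lengthen the reference, a model that produces long, repetitive output can score well above 100.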