Whisper Small GA-EN Speech Translation

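This model is a fine-tuned version of openai/whisper-small for Irish-to-English (GA-EN) speech translation, trained on the IWSLT-2023, FLEURS, BiteSize, and SpokenWords datasets. It achieves the following results on the evaluation set, measured at the final training step (4000):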

  • Loss: 1.8231
  • Bleu: 29.51
  • Chrf: 44.29
  • Wer: 67.0869
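
The WER figure above is a percentage, and because inserted words count as errors it can exceed 100 (the first row of the training results reports 104.4575). The card does not say which tool or text normalization produced the scores, so the following is only a minimal sketch of the standard word-level definition, with hypothetical example sentences:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by reference length, as a percentage.

    Generic definition only; the scoring tool and text normalization
    actually used for this model card are not stated in the card.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical sentences, not taken from the evaluation set.
print(word_error_rate("the cat sat", "the dog sat"))
print(word_error_rate("hello", "hello there my friend"))
```

In the second call the hypothesis adds three words to a one-word reference, which is how a score above 100 arises.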

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • training_steps: 4000
  • mixed_precision_training: Native AMP
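
To make the `lr_scheduler_type: linear` setting concrete, the sketch below computes the learning rate at a given step for a linear decay from 0.0001 to 0 over 4000 steps. The card lists no warmup setting, so `warmup_steps=0` is an assumption, and `linear_lr` is a hypothetical helper name rather than a library function:

```python
def linear_lr(step: int,
              base_lr: float = 1e-4,
              total_steps: int = 4000,
              warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero.

    base_lr and total_steps come from the hyperparameter list above;
    warmup_steps=0 is an assumption, since the card does not list it.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 1000, 2000, 4000):
    print(step, linear_lr(step))
```

Under these assumptions the rate is halved by step 2000 and reaches zero at step 4000.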

Training results

Training Loss Epoch Step Bleu Chrf Validation Loss Wer
1.9416 0.2155 100 13.09 26.48 1.7899 104.4575
1.5186 0.4310 200 18.6 35.75 1.5696 87.5732
1.2884 0.6466 300 17.57 37.19 1.4751 87.2580
1.0729 0.8621 400 17.92 38.23 1.4345 99.2346
0.4574 1.0776 500 22.48 39.17 1.5585 83.1607
0.4517 1.2931 600 22.53 38.38 1.5763 81.7650
0.4385 1.5086 700 20.05 39.46 1.5852 96.8483
0.3934 1.7241 800 26.89 42.67 1.5332 70.6889
0.3587 1.9397 900 28.95 44.16 1.5025 64.9707
0.1528 2.1552 1000 28.32 42.36 1.5882 65.8712
0.1425 2.3707 1100 25.5 42.42 1.6056 75.0113
0.1389 2.5862 1200 26.52 42.11 1.6236 70.6439
0.1532 2.8017 1300 25.78 41.61 1.6196 75.9118
0.1138 3.0172 1400 26.01 40.88 1.7185 69.6983
0.0661 3.2328 1500 28.74 43.16 1.6626 71.2292
0.0625 3.4483 1600 29.16 43.6 1.6835 66.3215
0.0615 3.6638 1700 28.93 44.08 1.6756 68.3476
0.0611 3.8793 1800 27.77 43.67 1.6648 72.1747
0.0344 4.0948 1900 28.33 44.18 1.7351 68.1225
0.0339 4.3103 2000 28.9 42.98 1.7715 67.0869
0.0369 4.5259 2100 29.83 44.87 1.7200 64.8807
0.0326 4.7414 2200 28.23 43.75 1.7232 69.3832
0.0346 4.9569 2300 27.72 43.1 1.7688 72.8050
0.0167 5.1724 2400 28.73 43.26 1.8072 67.4471
0.0146 5.3879 2500 29.91 44.24 1.7801 66.4566
0.0165 5.6034 2600 29.34 44.33 1.7782 68.2125
0.0143 5.8190 2700 27.78 43.07 1.7675 72.5799
0.0106 6.0345 2800 29.45 43.31 1.7660 67.5371
0.0098 6.25 2900 27.89 42.67 1.7803 71.6344
0.0087 6.4655 3000 27.66 43.04 1.7786 72.0396
0.0089 6.6810 3100 29.81 44.65 1.7661 67.3120
0.0081 6.8966 3200 29.48 44.3 1.7744 68.0324
0.0095 7.1121 3300 29.55 44.2 1.8197 67.5371
0.0112 7.3276 3400 29.34 43.9 1.8102 66.2765
0.0075 7.5431 3500 29.57 44.43 1.8004 67.3570
0.0111 7.7586 3600 29.56 44.57 1.8015 66.4566
0.009 7.9741 3700 29.7 45.24 1.8001 66.6817
0.005 8.1897 3800 29.21 44.4 1.8184 67.4471
0.0055 8.4052 3900 29.67 44.35 1.8222 67.1319
0.0042 8.6207 4000 29.51 44.29 1.8231 67.0869
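
The final step is not the best checkpoint on every metric. As a rough illustration, the sketch below copies six rows from the table above (the extremes plus the final step) and picks the best step per metric. The `Checkpoint` type and the variable names are hypothetical, and the card does not say which checkpoint was published:

```python
from typing import NamedTuple


class Checkpoint(NamedTuple):
    step: int
    bleu: float
    chrf: float
    val_loss: float
    wer: float


# A subset of rows copied from the training results table.
ROWS = [
    Checkpoint(400, 17.92, 38.23, 1.4345, 99.2346),
    Checkpoint(900, 28.95, 44.16, 1.5025, 64.9707),
    Checkpoint(2100, 29.83, 44.87, 1.7200, 64.8807),
    Checkpoint(2500, 29.91, 44.24, 1.7801, 66.4566),
    Checkpoint(3700, 29.70, 45.24, 1.8001, 66.6817),
    Checkpoint(4000, 29.51, 44.29, 1.8231, 67.0869),
]

# BLEU and chrF are higher-is-better; validation loss and WER are lower-is-better.
best_bleu = max(ROWS, key=lambda c: c.bleu)
best_chrf = max(ROWS, key=lambda c: c.chrf)
best_loss = min(ROWS, key=lambda c: c.val_loss)
best_wer = min(ROWS, key=lambda c: c.wer)

print("best BLEU at step", best_bleu.step)
print("best chrF at step", best_chrf.step)
print("lowest validation loss at step", best_loss.step)
print("lowest WER at step", best_wer.step)
```

Validation loss bottoms out early (step 400) while BLEU and chrF keep improving for thousands of steps, so selecting a checkpoint by loss alone would pick a much weaker translation model.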

Framework versions

  • Transformers 4.40.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1

Model size

  • Parameters: 242M
  • Tensor type: F32
  • Format: Safetensors

Datasets used to train ymoslem/whisper-small-ga2en-v1.2.1-r

  • IWSLT-2023, FLEURS, BiteSize, and SpokenWords

Evaluation results (self-reported)

  • Bleu on IWSLT-2023, FLEURS, BiteSize, and SpokenWords: 29.510
  • Wer on IWSLT-2023, FLEURS, BiteSize, and SpokenWords: 67.087