Tags: Automatic Speech Recognition · Transformers · TensorBoard · Safetensors · Irish · English · whisper · Generated from Trainer · Eval Results · Inference Endpoints

Whisper Medium GA-EN Speech Translation

This model is a fine-tuned version of openai/whisper-medium on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, Wikimedia, and EUbookshop datasets. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 1.0885
  • Bleu: 30.86
  • Chrf: 54.11
  • Wer: 67.0419
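
A minimal usage sketch, assuming the standard Transformers automatic-speech-recognition pipeline (the audio path is a placeholder; decoding options are left at their defaults):

```python
# Minimal inference sketch: Irish speech in, English text out.
# The audio file path is a placeholder; Whisper models expect 16 kHz
# audio, and the pipeline resamples on the fly when ffmpeg is available.
from transformers import pipeline

translator = pipeline(
    "automatic-speech-recognition",
    model="ymoslem/whisper-medium-ga2en-v6.3.1-r",
)

result = translator("path/to/irish_audio.wav")
print(result["text"])
```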

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.03
  • training_steps: 8000
  • mixed_precision_training: Native AMP
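
A hedged sketch of how these values map onto Seq2SeqTrainingArguments; the output directory and predict_with_generate flag are illustrative assumptions, the rest is copied from the list above:

```python
# Sketch only: the hyperparameters above expressed as Transformers
# Seq2SeqTrainingArguments. output_dir and predict_with_generate are
# assumptions not stated in this card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-medium-ga2en",   # assumption
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,                      # Adam betas/epsilon as listed
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.03,
    max_steps=8000,
    fp16=True,                           # "Native AMP" mixed precision
    predict_with_generate=True,          # assumption: enables BLEU/WER eval
)
```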

Training results

| Training Loss | Epoch  | Step | Bleu  | Chrf  | Validation Loss | Wer      |
|:-------------:|:------:|:----:|:-----:|:-----:|:---------------:|:--------:|
| 2.5374        | 0.0138 | 100  | 2.56  | 18.92 | 2.1201          | 222.4674 |
| 2.446         | 0.0276 | 200  | 3.07  | 20.56 | 2.1960          | 170.5088 |
| 2.2819        | 0.0414 | 300  | 5.87  | 25.17 | 1.9811          | 114.5880 |
| 2.1904        | 0.0552 | 400  | 8.41  | 25.65 | 1.9974          | 99.1896  |
| 2.026         | 0.0690 | 500  | 7.99  | 27.64 | 1.8961          | 130.7069 |
| 2.0448        | 0.0828 | 600  | 9.15  | 27.78 | 1.9410          | 104.9077 |
| 1.8606        | 0.0966 | 700  | 9.57  | 29.34 | 1.8451          | 110.4908 |
| 1.9887        | 0.1103 | 800  | 13.44 | 32.32 | 1.7419          | 84.3314  |
| 1.8633        | 0.1241 | 900  | 13.43 | 31.58 | 1.7376          | 102.1162 |
| 1.7576        | 0.1379 | 1000 | 11.9  | 32.68 | 1.6879          | 106.6186 |
| 1.7142        | 0.1517 | 1100 | 12.4  | 33.66 | 1.7571          | 102.6114 |
| 1.7168        | 0.1655 | 1200 | 17.35 | 36.55 | 1.6003          | 87.9784  |
| 1.6741        | 0.1793 | 1300 | 15.41 | 35.46 | 1.5883          | 92.8411  |
| 1.6534        | 0.1931 | 1400 | 17.12 | 37.24 | 1.5366          | 90.2296  |
| 1.58          | 0.2069 | 1500 | 17.49 | 38.5  | 1.5141          | 92.1207  |
| 1.403         | 0.2207 | 1600 | 16.78 | 39.13 | 1.4606          | 88.9689  |
| 1.3806        | 0.2345 | 1700 | 19.26 | 40.02 | 1.4263          | 86.7177  |
| 1.5111        | 0.2483 | 1800 | 18.4  | 39.47 | 1.4060          | 92.2557  |
| 1.4261        | 0.2621 | 1900 | 21.19 | 42.13 | 1.3911          | 78.7033  |
| 1.2974        | 0.2759 | 2000 | 15.6  | 38.66 | 1.3871          | 100.3152 |
| 1.2694        | 0.2897 | 2100 | 16.21 | 39.99 | 1.3527          | 91.2652  |
| 1.204         | 0.3034 | 2200 | 20.2  | 41.18 | 1.3232          | 86.8978  |
| 1.1922        | 0.3172 | 2300 | 16.44 | 40.85 | 1.3338          | 103.1968 |
| 1.1237        | 0.3310 | 2400 | 19.29 | 43.73 | 1.2830          | 94.4620  |
| 1.0989        | 0.3448 | 2500 | 25.11 | 46.84 | 1.2844          | 75.0563  |
| 1.0766        | 0.3586 | 2600 | 23.87 | 46.1  | 1.2578          | 74.5160  |
| 1.0432        | 0.3724 | 2700 | 22.31 | 44.91 | 1.2414          | 86.9878  |
| 1.1588        | 0.3862 | 2800 | 23.32 | 45.94 | 1.2051          | 77.1724  |
| 1.0062        | 0.4    | 2900 | 26.15 | 48.27 | 1.2059          | 69.4282  |
| 0.9178        | 0.4138 | 3000 | 29.13 | 48.92 | 1.1756          | 64.7456  |
| 0.9108        | 0.4276 | 3100 | 28.34 | 48.9  | 1.1665          | 67.2220  |
| 0.9868        | 0.4414 | 3200 | 25.64 | 48.93 | 1.1489          | 75.3264  |
| 0.9563        | 0.4552 | 3300 | 27.58 | 49.67 | 1.1181          | 71.8145  |
| 0.9138        | 0.4690 | 3400 | 28.37 | 50.96 | 1.1247          | 71.4543  |
| 0.8508        | 0.4828 | 3500 | 29.75 | 51.41 | 1.1007          | 68.3476  |
| 0.836         | 0.4966 | 3600 | 30.99 | 52.2  | 1.1114          | 66.5916  |
| 0.8435        | 0.5103 | 3700 | 30.64 | 52.77 | 1.0782          | 68.2125  |
| 0.8323        | 0.5241 | 3800 | 29.78 | 52.94 | 1.0744          | 68.9779  |
| 0.818         | 0.5379 | 3900 | 31.23 | 53.21 | 1.0639          | 67.7623  |
| 0.8095        | 0.5517 | 4000 | 31.02 | 53.51 | 1.0576          | 68.5277  |
| 0.922         | 0.5655 | 4100 | 25.47 | 46.16 | 1.2445          | 74.2909  |
| 1.0387        | 0.5793 | 4200 | 25.44 | 46.19 | 1.2634          | 71.0491  |
| 0.9386        | 0.5931 | 4300 | 22.36 | 45.4  | 1.2457          | 76.8122  |
| 0.9297        | 0.6069 | 4400 | 28.65 | 46.48 | 1.2502          | 65.7362  |
| 0.9837        | 0.6207 | 4500 | 26.81 | 46.53 | 1.2503          | 68.9779  |
| 1.0226        | 0.6345 | 4600 | 19.37 | 44.1  | 1.2282          | 86.4926  |
| 0.9896        | 0.6483 | 4700 | 26.06 | 46.46 | 1.2568          | 70.8240  |
| 0.9805        | 0.6621 | 4800 | 19.29 | 42.56 | 1.2364          | 82.0351  |
| 0.8982        | 0.6759 | 4900 | 28.58 | 47.84 | 1.2346          | 64.6556  |
| 0.8303        | 0.6897 | 5000 | 27.25 | 48.15 | 1.2136          | 68.3476  |
| 0.905         | 0.7034 | 5100 | 27.99 | 50.31 | 1.1808          | 67.2220  |
| 0.8125        | 0.7172 | 5200 | 28.91 | 47.63 | 1.1971          | 65.4660  |
| 0.7965        | 0.7310 | 5300 | 25.96 | 47.21 | 1.1789          | 69.5633  |
| 0.8244        | 0.7448 | 5400 | 28.65 | 48.63 | 1.2237          | 66.6367  |
| 0.7637        | 0.7586 | 5500 | 30.4  | 50.24 | 1.1765          | 66.6817  |
| 0.7333        | 0.7724 | 5600 | 29.94 | 51.34 | 1.1295          | 68.8879  |
| 0.8141        | 0.7862 | 5700 | 27.51 | 50.61 | 1.1238          | 74.7861  |
| 0.6969        | 0.8    | 5800 | 23.95 | 48.76 | 1.1350          | 87.6632  |
| 0.7162        | 0.8138 | 5900 | 26.34 | 48.65 | 1.1493          | 74.0207  |
| 0.7421        | 0.8276 | 6000 | 28.69 | 52.23 | 1.0976          | 68.5727  |
| 0.593         | 0.8414 | 6100 | 34.96 | 53.13 | 1.1163          | 59.3426  |
| 0.678         | 0.8552 | 6200 | 34.14 | 53.2  | 1.1072          | 61.6839  |
| 0.6018        | 0.8690 | 6300 | 31.8  | 53.33 | 1.0959          | 64.1153  |
| 0.6038        | 0.8828 | 6400 | 24.77 | 50.61 | 1.0959          | 84.2413  |
| 0.6174        | 0.8966 | 6500 | 25.48 | 50.6  | 1.0891          | 81.6749  |
| 0.595         | 0.9103 | 6600 | 23.83 | 48.07 | 1.1037          | 83.3859  |
| 0.6114        | 0.9241 | 6700 | 28.03 | 52.18 | 1.0723          | 70.7789  |
| 0.6257        | 0.9379 | 6800 | 33.13 | 52.95 | 1.0797          | 61.5038  |
| 0.6689        | 0.9517 | 6900 | 30.53 | 52.41 | 1.0803          | 68.4376  |
| 0.4908        | 0.9655 | 7000 | 30.1  | 51.71 | 1.0901          | 69.1130  |
| 0.5439        | 0.9793 | 7100 | 25.81 | 49.36 | 1.0672          | 76.5871  |
| 0.5994        | 0.9931 | 7200 | 31.56 | 52.51 | 1.0705          | 66.1414  |
| 0.2451        | 1.0069 | 7300 | 33.0  | 53.29 | 1.1069          | 64.7006  |
| 0.2609        | 1.0207 | 7400 | 31.68 | 54.3  | 1.0877          | 64.9257  |
| 0.2813        | 1.0345 | 7500 | 34.93 | 54.74 | 1.0910          | 60.1531  |
| 0.2367        | 1.0483 | 7600 | 30.87 | 53.09 | 1.0999          | 65.9163  |
| 0.2018        | 1.0621 | 7700 | 35.53 | 54.42 | 1.0917          | 58.7573  |
| 0.2407        | 1.0759 | 7800 | 34.38 | 54.5  | 1.0859          | 60.9185  |
| 0.2385        | 1.0897 | 7900 | 31.27 | 54.12 | 1.0866          | 65.3309  |
| 0.2074        | 1.1034 | 8000 | 30.86 | 54.11 | 1.0885          | 67.0419  |
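
The Bleu, Chrf, and Wer columns above can be reproduced with the evaluate library's sacrebleu, chrf, and wer metrics. A minimal sketch with toy strings follows; the card does not state which implementation or text normalization was actually used, so treat this as an approximation:

```python
# Sketch: scoring translation output with BLEU, chrF, and WER.
# The two sentences are toy examples, not data from this card.
import evaluate  # also requires sacrebleu and jiwer to be installed

predictions = ["the weather is fine today"]
references = [["the weather is nice today"]]  # one list of references per prediction

bleu = evaluate.load("sacrebleu").compute(predictions=predictions, references=references)
chrf = evaluate.load("chrf").compute(predictions=predictions, references=references)
# WER takes one flat reference string per prediction.
wer = evaluate.load("wer").compute(predictions=predictions, references=[r[0] for r in references])

print(f"BLEU {bleu['score']:.2f}  chrF {chrf['score']:.2f}  WER {100 * wer:.2f}")
```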

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model details

  • Model size: 764M params
  • Tensor type: F32 (Safetensors)

Finetuned from: openai/whisper-medium

Datasets used to train ymoslem/whisper-medium-ga2en-v6.3.1-r: IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, Wikimedia, and EUbookshop

Evaluation results

  • Bleu on IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, Wikimedia, and EUbookshop: 30.860 (self-reported)
  • Wer on IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, Wikimedia, and EUbookshop: 67.042 (self-reported)