Whisper Small GA-EN Speech Translation, fine-tuned from 1.2, without SpokenWords

This model is a fine-tuned version of openai/whisper-small for Irish-to-English (GA-EN) speech translation, trained on the IWSLT-2023, FLEURS, and BiteSize datasets. At the final checkpoint (step 5000) it achieves the following results on the evaluation set:

  • Loss: 1.7177
  • Bleu: 29.96
  • Chrf: 45.61
  • Wer: 66.9968
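
The WER figure is a percentage and is not capped at 100, which is why early checkpoints in the training results can show values such as 179.7839. The card does not say which library computed the scores, so the following is only a minimal sketch of the standard word-level definition (substitutions + deletions + insertions, divided by the number of reference words). The function name and the example sentences are hypothetical and are not taken from the evaluation set.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by reference length, as a percentage."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical sentences: three extra words against a two-word reference.
print(word_error_rate("good morning", "good morning to you all"))
```

A hypothesis that is much longer than its reference adds insertions without increasing the denominator, so the score can exceed 100.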

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 0.03
  • training_steps: 5000
  • mixed_precision_training: Native AMP
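
To make the linear schedule concrete, here is a minimal sketch of a warmup-then-linear-decay learning rate using the listed peak rate (0.0001) and step count (5000). It is an illustration of the general shape, not the training code for this model. Note that the listed warmup value of 0.03 is not a whole number of steps; the sketch therefore keeps `warmup_steps` as a parameter defaulting to 0. Reading 0.03 as a ratio (150 of 5000 steps) is only a guess and is labeled as such in the code.

```python
def linear_lr(step: int, base_lr: float = 1e-4, total_steps: int = 5000,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to 0."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr (end of warmup) to 0 (end of training).
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# No warmup: the rate starts at its peak and halves by mid-training.
print(linear_lr(0), linear_lr(2500), linear_lr(5000))

# ASSUMPTION: 0.03 read as a warmup ratio, i.e. 0.03 * 5000 = 150 steps.
print(linear_lr(75, warmup_steps=150), linear_lr(150, warmup_steps=150))
```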

Training results

| Training Loss | Epoch | Step | Bleu | Chrf | Validation Loss | Wer |
|---|---|---|---|---|---|---|
| 2.4954 | 0.11 | 100 | 3.7 | 18.03 | 2.1286 | 179.7839 |
| 2.045 | 0.22 | 200 | 12.65 | 25.53 | 1.8146 | 100.9005 |
| 1.7928 | 0.32 | 300 | 13.78 | 30.2 | 1.7253 | 101.9811 |
| 1.6615 | 0.43 | 400 | 15.8 | 31.88 | 1.6834 | 92.5259 |
| 1.4491 | 0.54 | 500 | 15.61 | 36.27 | 1.5971 | 107.3841 |
| 1.2074 | 0.65 | 600 | 19.92 | 36.31 | 1.5939 | 84.3314 |
| 1.2308 | 0.76 | 700 | 20.37 | 38.72 | 1.5234 | 84.8267 |
| 1.107 | 0.86 | 800 | 21.35 | 37.87 | 1.5460 | 82.8906 |
| 0.9491 | 0.97 | 900 | 21.06 | 40.74 | 1.5161 | 82.5754 |
| 0.384 | 1.08 | 1000 | 23.24 | 41.98 | 1.4927 | 82.2152 |
| 0.362 | 1.19 | 1100 | 23.19 | 42.24 | 1.5567 | 80.2792 |
| 0.3756 | 1.29 | 1200 | 27.83 | 43.8 | 1.5265 | 69.2481 |
| 0.3401 | 1.4 | 1300 | 21.79 | 41.66 | 1.5522 | 92.3908 |
| 0.3346 | 1.51 | 1400 | 24.61 | 42.15 | 1.5085 | 75.4615 |
| 0.3101 | 1.62 | 1500 | 26.67 | 43.41 | 1.4933 | 70.7789 |
| 0.3231 | 1.73 | 1600 | 27.95 | 42.82 | 1.4979 | 68.3026 |
| 0.2665 | 1.83 | 1700 | 28.5 | 43.76 | 1.4977 | 68.1225 |
| 0.2704 | 1.94 | 1800 | 28.15 | 43.87 | 1.5063 | 68.8429 |
| 0.0769 | 2.05 | 1900 | 25.76 | 43.22 | 1.5162 | 77.6227 |
| 0.0597 | 2.16 | 2000 | 25.04 | 43.15 | 1.5216 | 79.0635 |
| 0.0743 | 2.27 | 2100 | 27.85 | 44.43 | 1.5313 | 68.3926 |
| 0.0878 | 2.37 | 2200 | 27.54 | 43.96 | 1.5495 | 68.3476 |
| 0.0712 | 2.48 | 2300 | 28.28 | 44.39 | 1.5355 | 65.8712 |
| 0.0789 | 2.59 | 2400 | 28.64 | 44.75 | 1.5277 | 65.7812 |
| 0.073 | 2.7 | 2500 | 29.09 | 44.65 | 1.5327 | 65.7812 |
| 0.073 | 2.8 | 2600 | 25.26 | 43.44 | 1.5304 | 78.2981 |
| 0.0697 | 2.91 | 2700 | 25.71 | 43.02 | 1.5460 | 78.4782 |
| 0.0398 | 3.02 | 2800 | 28.26 | 44.71 | 1.5580 | 72.8501 |
| 0.0302 | 3.13 | 2900 | 30.25 | 45.46 | 1.5688 | 66.1414 |
| 0.0424 | 3.24 | 3000 | 29.88 | 45.21 | 1.5693 | 66.0964 |
| 0.0397 | 3.34 | 3100 | 30.01 | 45.85 | 1.5934 | 65.6911 |
| 0.0346 | 3.45 | 3200 | 30.2 | 45.8 | 1.5818 | 65.8262 |
| 0.032 | 3.56 | 3300 | 29.81 | 46.5 | 1.5823 | 66.7267 |
| 0.0348 | 3.67 | 3400 | 30.77 | 46.43 | 1.5752 | 64.6556 |
| 0.0522 | 5.97 | 3500 | 29.69 | 45.47 | 1.6080 | 65.8712 |
| 0.0443 | 6.14 | 3600 | 29.54 | 44.71 | 1.6272 | 65.1508 |
| 0.0492 | 6.31 | 3700 | 29.3 | 45.36 | 1.6211 | 68.3926 |
| 0.0544 | 6.48 | 3800 | 30.08 | 44.39 | 1.6069 | 64.9257 |
| 0.0574 | 6.66 | 3900 | 28.86 | 44.6 | 1.6306 | 66.2765 |
| 0.0535 | 6.83 | 4000 | 27.92 | 43.48 | 1.6722 | 67.9874 |
| 0.0424 | 7.0 | 4100 | 27.48 | 44.29 | 1.6968 | 70.3737 |
| 0.0235 | 7.17 | 4200 | 27.97 | 45.34 | 1.6768 | 70.0135 |
| 0.0262 | 7.34 | 4300 | 28.77 | 45.74 | 1.6908 | 68.3926 |
| 0.0218 | 7.51 | 4400 | 28.97 | 46.57 | 1.6890 | 69.5182 |
| 0.0293 | 7.68 | 4500 | 29.51 | 45.38 | 1.6742 | 68.8429 |
| 0.0194 | 7.85 | 4600 | 29.63 | 45.18 | 1.6962 | 67.9874 |
| 0.0187 | 8.02 | 4700 | 30.1 | 45.28 | 1.6936 | 66.0964 |
| 0.0115 | 8.19 | 4800 | 30.0 | 46.02 | 1.7162 | 67.6722 |
| 0.0138 | 8.36 | 4900 | 30.34 | 46.01 | 1.7113 | 66.6817 |
| 0.0098 | 8.53 | 5000 | 29.96 | 45.61 | 1.7177 | 66.9968 |
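
The headline results come from the last checkpoint (step 5000), but the table shows that other checkpoints score better on individual metrics: the highest Bleu (30.77) and lowest Wer (64.6556) are at step 3400, and the lowest validation loss (1.4927) is at step 1000. The following sketch shows one way to pick a checkpoint per metric. It uses only a handful of rows copied from the table; the `Checkpoint` type and field names are hypothetical and chosen for illustration.

```python
from typing import NamedTuple


class Checkpoint(NamedTuple):
    step: int
    bleu: float
    chrf: float
    val_loss: float
    wer: float


# A subset of rows copied from the training results table (not the full log).
ROWS = [
    Checkpoint(1000, 23.24, 41.98, 1.4927, 82.2152),
    Checkpoint(3400, 30.77, 46.43, 1.5752, 64.6556),
    Checkpoint(3800, 30.08, 44.39, 1.6069, 64.9257),
    Checkpoint(4400, 28.97, 46.57, 1.6890, 69.5182),
    Checkpoint(5000, 29.96, 45.61, 1.7177, 66.9968),
]

best_bleu = max(ROWS, key=lambda r: r.bleu)       # higher is better
best_chrf = max(ROWS, key=lambda r: r.chrf)       # higher is better
best_wer = min(ROWS, key=lambda r: r.wer)         # lower is better
best_loss = min(ROWS, key=lambda r: r.val_loss)   # lower is better

print("best Bleu at step", best_bleu.step)
print("best Chrf at step", best_chrf.step)
print("best Wer at step", best_wer.step)
print("lowest validation loss at step", best_loss.step)
```

Because validation loss keeps rising after step 1000 while Bleu and Wer keep improving, selecting a checkpoint by loss alone would give a different answer than selecting by a translation metric.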

Framework versions

  • Transformers 4.39.3
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Model size: 242M params (Safetensors, tensor type F32)

Model: ymoslem/whisper-small-ga2en-v1.3