
Whisper Small GA-EN Speech Translation + VAD + warmup_ratio=0.01

This model is a fine-tuned version of openai/whisper-small on the IWSLT-2023, FLEURS, BiteSize, and SpokenWords datasets. It achieves the following results on the evaluation set:

  • Loss: 1.7482
  • Bleu: 29.94
  • Chrf: 45.74
  • Wer: 64.3404
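
For inference, the checkpoint can be loaded through the Transformers pipeline API. Below is a minimal sketch, assuming the repository id ymoslem/whisper-small-ga2en-v1.6-r from this card and a placeholder audio path; `task="translate"` asks Whisper to emit English text for Irish input speech.

```python
from transformers import pipeline

# Load the fine-tuned checkpoint (repo id taken from this card).
pipe = pipeline(
    "automatic-speech-recognition",
    model="ymoslem/whisper-small-ga2en-v1.6-r",
)

# "audio.wav" is a placeholder path to an Irish speech recording.
# task="translate" makes Whisper produce English output.
result = pipe("audio.wav", generate_kwargs={"task": "translate"})
print(result["text"])
```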

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
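
While the data preparation is not documented here, the card names four corpora: IWSLT-2023, FLEURS, BiteSize, and SpokenWords. For illustration only, the sketch below loads the FLEURS Irish split from the Hub; the google/fleurs repo id and ga_ie config are standard Hub identifiers, not taken from this model's training script.

```python
from datasets import load_dataset

# Illustrative only: load the FLEURS Irish (ga_ie) training split.
# FLEURS is a script-based dataset, hence trust_remote_code=True.
fleurs_ga = load_dataset("google/fleurs", "ga_ie", split="train",
                         trust_remote_code=True)

sample = fleurs_ga[0]
print(sample["audio"]["sampling_rate"], sample["transcription"])
```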

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.01
  • training_steps: 3000
  • mixed_precision_training: Native AMP
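
Expressed as Transformers Seq2SeqTrainingArguments, the list above corresponds roughly to the sketch below. The original training script is not part of this card, so output_dir is a placeholder and fp16=True stands in for Native AMP.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the hyperparameters listed above; not the original script.
training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-small-ga2en",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.01,
    max_steps=3000,
    fp16=True,  # "Native AMP" mixed precision
)
```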

Training results

| Training Loss | Epoch  | Step | Bleu  | Chrf  | Validation Loss | Wer      |
|:-------------:|:------:|:----:|:-----:|:-----:|:---------------:|:--------:|
| 2.0518        | 0.2188 | 100  | 8.56  | 25.29 | 1.8072          | 123.9982 |
| 1.5449        | 0.4376 | 200  | 18.41 | 34.82 | 1.5746          | 83.7461  |
| 1.2518        | 0.6565 | 300  | 21.1  | 36.24 | 1.5009          | 83.9712  |
| 1.0947        | 0.8753 | 400  | 21.5  | 41.43 | 1.4582          | 89.8694  |
| 0.4439        | 1.0941 | 500  | 25.21 | 41.77 | 1.4979          | 72.5799  |
| 0.4416        | 1.3129 | 600  | 22.2  | 40.47 | 1.5107          | 79.8739  |
| 0.4417        | 1.5317 | 700  | 20.2  | 40.75 | 1.5215          | 88.8789  |
| 0.4108        | 1.7505 | 800  | 25.73 | 41.28 | 1.5278          | 67.8073  |
| 0.355         | 1.9694 | 900  | 20.6  | 39.37 | 1.5436          | 87.3030  |
| 0.1303        | 2.1882 | 1000 | 28.79 | 42.68 | 1.5936          | 68.1675  |
| 0.1421        | 2.4070 | 1100 | 27.84 | 42.58 | 1.5745          | 67.5371  |
| 0.1341        | 2.6258 | 1200 | 30.52 | 45.15 | 1.5953          | 66.5916  |
| 0.1365        | 2.8446 | 1300 | 26.93 | 43.72 | 1.6046          | 74.2909  |
| 0.0528        | 3.0635 | 1400 | 29.03 | 44.12 | 1.6303          | 64.8807  |
| 0.0519        | 3.2823 | 1500 | 27.75 | 44.34 | 1.6774          | 68.6177  |
| 0.0554        | 3.5011 | 1600 | 27.64 | 45.15 | 1.6637          | 71.1842  |
| 0.0514        | 3.7199 | 1700 | 30.26 | 44.62 | 1.6497          | 65.4660  |
| 0.0503        | 3.9387 | 1800 | 26.88 | 43.0  | 1.6780          | 70.4187  |
| 0.0259        | 4.1575 | 1900 | 29.6  | 44.51 | 1.6915          | 64.9707  |
| 0.0263        | 4.3764 | 2000 | 25.33 | 42.51 | 1.7080          | 72.3998  |
| 0.0254        | 4.5952 | 2100 | 30.59 | 45.35 | 1.6884          | 64.2954  |
| 0.0211        | 4.8140 | 2200 | 31.09 | 46.56 | 1.6984          | 64.0252  |
| 0.0137        | 5.0328 | 2300 | 28.96 | 43.67 | 1.7253          | 66.3665  |
| 0.0075        | 5.2516 | 2400 | 29.77 | 44.63 | 1.7112          | 66.9968  |
| 0.0056        | 5.4705 | 2500 | 29.96 | 45.51 | 1.7197          | 64.5655  |
| 0.0067        | 5.6893 | 2600 | 29.86 | 45.25 | 1.7464          | 66.0964  |
| 0.0064        | 5.9081 | 2700 | 29.47 | 45.36 | 1.7440          | 65.2859  |
| 0.0023        | 6.1269 | 2800 | 30.03 | 46.49 | 1.7419          | 64.4755  |
| 0.0016        | 6.3457 | 2900 | 29.76 | 45.64 | 1.7474          | 65.0158  |
| 0.0019        | 6.5646 | 3000 | 29.94 | 45.74 | 1.7482          | 64.3404  |
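
The Bleu, Chrf, and Wer columns can be computed with the Hugging Face evaluate library (WER additionally needs jiwer installed). The sketch below uses placeholder predictions and references rather than the actual evaluation set.

```python
import evaluate

bleu = evaluate.load("sacrebleu")
chrf = evaluate.load("chrf")
wer = evaluate.load("wer")

predictions = ["a placeholder English hypothesis"]  # model outputs
references = ["a placeholder English reference"]    # gold translations

# sacrebleu and chrf expect one list of references per prediction.
print(bleu.compute(predictions=predictions,
                   references=[[r] for r in references])["score"])
print(chrf.compute(predictions=predictions,
                   references=[[r] for r in references])["score"])
print(wer.compute(predictions=predictions, references=references))
```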

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.2.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
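
A quick sanity check that an environment matches this setup (the expected values are the versions listed above):

```python
import datasets
import tokenizers
import torch
import transformers

# Expected: 4.41.2, 2.2.0+cu121, 2.19.1, 0.19.1
print(transformers.__version__, torch.__version__,
      datasets.__version__, tokenizers.__version__)
```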
