Whisper Medium GA-EN Speech Translation Raw

This model is a fine-tuned version of openai/whisper-medium for Irish-to-English (GA-EN) speech translation, trained on the IWSLT-2023, FLEURS, BiteSize, and SpokenWords datasets. It achieves the following results on the evaluation set (final checkpoint, step 2000):

  • Loss: 1.4321
  • Bleu: 30.23
  • Chrf: 48.18
  • Wer: 65.3760
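
The Wer figure is reported as a percentage, and several intermediate checkpoints in the training results exceed 100 (for example 139.4417 at step 100). The card does not say which WER implementation produced these numbers, so the following is only a minimal sketch of the usual word-level definition (edit distance divided by reference length), with a hypothetical function name. It shows how a hypothesis with many extra words can push the score past 100.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by reference length, as a percentage.

    Illustrative sketch only; not necessarily the implementation used
    to produce the numbers in this card.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# Three inserted words against a one-word reference.
print(word_error_rate("hello", "hello there big world"))
```

Because the denominator is the reference length only, insertions are unbounded, which is why an early checkpoint that over-generates can score above 100.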

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.03
  • training_steps: 2000
  • mixed_precision_training: Native AMP
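
To make the scheduler settings concrete, here is a rough sketch of the learning rate implied by a linear schedule with warmup. It assumes the warmup length is the warmup ratio times the total number of steps, rounded up, and that the rate then decays linearly to zero at the final step. The function names are hypothetical, and the code is a plain-Python approximation rather than the training script itself.

```python
import math

# Values taken from the hyperparameter list above.
LEARNING_RATE = 1e-4
TRAINING_STEPS = 2000
WARMUP_RATIO = 0.03


def warmup_steps(total_steps: int = TRAINING_STEPS,
                 ratio: float = WARMUP_RATIO) -> int:
    """Number of warmup steps, assuming ratio * total rounded up."""
    return math.ceil(total_steps * ratio)


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              total_steps: int = TRAINING_STEPS,
              ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at a given step under linear warmup, then linear decay."""
    warmup = warmup_steps(total_steps, ratio)
    if step < warmup:
        # Ramp up from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup)
    # Decay from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup)


for step in (0, 30, 60, 1000, 2000):
    print(step, linear_lr(step))
```

Under these assumptions the peak rate of 0.0001 is reached after 60 steps, which is before the first logged evaluation at step 100.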

Training results

| Training Loss | Epoch  | Step | Validation Loss | Bleu  | Chrf  | Wer      |
|---------------|--------|------|-----------------|-------|-------|----------|
| 2.6013        | 0.0539 | 100  | 2.2401          | 3.18  | 17.57 | 139.4417 |
| 2.5749        | 0.1079 | 200  | 3.0398          | 0.0   | 3.87  | 100.4052 |
| 2.3449        | 0.1618 | 300  | 2.0560          | 7.53  | 24.09 | 121.0266 |
| 2.0392        | 0.2157 | 400  | 1.9721          | 10.7  | 29.63 | 109.7253 |
| 1.9155        | 0.2697 | 500  | 1.9402          | 16.73 | 31.59 | 81.9901  |
| 1.9148        | 0.3236 | 600  | 1.7868          | 11.12 | 32.9  | 117.1544 |
| 1.698         | 0.3776 | 700  | 1.7244          | 20.14 | 36.31 | 83.8811  |
| 1.7283        | 0.4315 | 800  | 1.6586          | 16.74 | 34.0  | 94.5070  |
| 1.5213        | 0.4854 | 900  | 1.6387          | 19.49 | 38.29 | 84.2413  |
| 1.3123        | 0.5394 | 1000 | 1.6292          | 22.27 | 41.45 | 80.2792  |
| 1.1584        | 0.5933 | 1100 | 1.5900          | 25.48 | 42.03 | 74.2008  |
| 1.1734        | 0.6472 | 1200 | 1.5495          | 17.77 | 40.1  | 106.9338 |
| 1.2271        | 0.7012 | 1300 | 1.4978          | 21.7  | 43.63 | 84.2413  |
| 1.0872        | 0.7551 | 1400 | 1.4690          | 25.34 | 43.98 | 74.2909  |
| 0.9331        | 0.8091 | 1500 | 1.4688          | 20.09 | 43.14 | 90.5448  |
| 0.7861        | 0.8630 | 1600 | 1.4284          | 26.49 | 46.76 | 76.4971  |
| 0.8392        | 0.9169 | 1700 | 1.3909          | 27.22 | 46.91 | 73.3904  |
| 0.7236        | 0.9709 | 1800 | 1.4349          | 26.98 | 46.01 | 74.2008  |
| 0.2741        | 1.0248 | 1900 | 1.4279          | 28.92 | 47.63 | 68.3476  |
| 0.2782        | 1.0787 | 2000 | 1.4321          | 30.23 | 48.18 | 65.3760  |
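
The Epoch and Step columns together give a rough sense of the training set size. The sketch below works through that arithmetic for the final row. It assumes each step consumes exactly one batch of train_batch_size examples (single device, no gradient accumulation), which the card does not state, so the result is an estimate only. The function names are hypothetical.

```python
# Final row of the training results and the listed batch size.
LAST_STEP = 2000
LAST_EPOCH = 1.0787
TRAIN_BATCH_SIZE = 16


def steps_per_epoch(step: int = LAST_STEP, epoch: float = LAST_EPOCH) -> float:
    """Optimizer steps that make up one full pass over the training data."""
    return step / epoch


def approx_training_examples(batch_size: int = TRAIN_BATCH_SIZE) -> int:
    """Estimated training examples, assuming one batch per step."""
    return round(steps_per_epoch()) * batch_size


print(f"steps per epoch: about {steps_per_epoch():.0f}")
print(f"training examples: about {approx_training_examples()}")
```

The same ratio explains why the training loss drops sharply between steps 1800 and 1900: the run crosses into its second epoch there, so the model starts seeing examples it has already trained on.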

Framework versions

  • Transformers 4.41.2
  • PyTorch 2.2.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size: 764M params (Safetensors, tensor type F32)

Datasets used to train ymoslem/whisper-medium-ga2en-v1.3.0-r

Evaluation results

  • Bleu on IWSLT-2023, FLEURS, BiteSize, and SpokenWords (self-reported): 30.230
  • Wer on IWSLT-2023, FLEURS, BiteSize, and SpokenWords (self-reported): 65.376