metadata
language:
  - ga
  - en
license: apache-2.0
base_model: openai/whisper-large
tags:
  - generated_from_trainer
datasets:
  - ymoslem/IWSLT2023-GA-EN
  - ymoslem/FLEURS-GA-EN
  - ymoslem/BitesizeIrish-GA-EN
  - ymoslem/SpokenWords-GA-EN-MTed
metrics:
  - bleu
  - wer
model-index:
  - name: Whisper Large GA-EN Speech Translation
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia
          type: ymoslem/IWSLT2023-GA-EN
        metrics:
          - name: Bleu
            type: bleu
            value: 30.16
          - name: Wer
            type: wer
            value: 69.968482665466

Whisper Large GA-EN Speech Translation

This model is a fine-tuned version of openai/whisper-large for Irish-to-English (GA-EN) speech translation, trained on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia datasets. The datasets are augmented in two ways: noise augmentation and truncation of low-amplitude samples. This version is the best checkpoint as measured by ChrF, taken at step 3000 (epoch 0.99). It achieves the following results on the evaluation set:

  • Loss: 1.1742
  • Bleu: 30.16
  • Chrf: 50.72
  • Wer: 69.9685
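WER is reported here as a percentage, and it is not capped at 100: when the hypothesis contains more words than the reference, insertions alone can push it higher. The sketch below is a minimal from-scratch WER, assuming whitespace tokenization and no text normalization. It is not the metric implementation used to produce the numbers on this card, and `word_error_rate` is a name chosen for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length, in percent."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (cost 0 on a match)
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Three inserted words against a two-word reference: 3 / 2 = 150%.
print(word_error_rate("good morning", "good morning to you all"))
```

This is one way values above 100 can arise, as they do in the early rows of the training results.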

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
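The introduction mentions two augmentations, noise augmentation and truncation of low-amplitude samples, but the card does not say how either was implemented. The sketch below shows one plausible reading of each, using only the standard library: additive white Gaussian noise at a chosen signal-to-noise ratio, and trimming of quiet samples from both ends of a clip. The function names, the SNR parameter, and the amplitude threshold are all assumptions made for illustration, not the author's settings.

```python
import math
import random


def add_noise(samples, snr_db, rng):
    """Add white Gaussian noise at roughly the given SNR (in dB)."""
    if not samples:
        return []
    signal_power = sum(s * s for s in samples) / len(samples)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


def trim_low_amplitude(samples, threshold):
    """Drop leading and trailing samples whose magnitude is below threshold."""
    loud = [i for i, s in enumerate(samples) if abs(s) >= threshold]
    if not loud:
        return []
    return samples[loud[0]:loud[-1] + 1]


rng = random.Random(42)  # fixed seed so the noise is reproducible
clip = [0.0, 0.001, 0.5, -0.4, 0.002, 0.0]  # toy waveform, not real audio
trimmed = trim_low_amplitude(clip, threshold=0.01)
noisy = add_noise(trimmed, snr_db=20, rng=rng)
print(trimmed, len(noisy))
```

Trimming before adding noise keeps the noise level tied to the speech portion of the clip rather than to the surrounding silence. Whether the original pipeline used this order is not stated.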

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 0.03
  • training_steps: 3000
  • mixed_precision_training: Native AMP
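Two of the values above follow from the others, and a short calculation makes the relationship explicit. The sketch below derives the total train batch size from the per-device batch size and the gradient accumulation steps, and evaluates a linear learning-rate schedule at a few steps. It assumes a single training device (consistent with 4 × 4 = 16). The card lists 0.03 as the warmup value, which is not a whole number of steps, so the sketch takes the warmup length as a parameter and defaults it to 0. That default is an assumption, and `linear_lr` is a hypothetical helper, not a library function.

```python
train_batch_size = 4
gradient_accumulation_steps = 4
num_devices = 1  # assumption: single device

# One optimizer update sees this many examples.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def linear_lr(step, base_lr=1e-4, total_steps=3000, warmup_steps=0):
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(total_train_batch_size)
for step in (0, 1500, 3000):
    print(step, linear_lr(step))
```

With no warmup, the learning rate starts at 0.0001, is halved by step 1500, and reaches zero at step 3000.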

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Chrf | Wer |
|---|---|---|---|---|---|---|
| 3.1833 | 0.03 | 100 | 2.5169 | 2.03 | 16.8 | 215.5786 |
| 2.7632 | 0.07 | 200 | 2.1827 | 7.81 | 24.07 | 113.1022 |
| 2.5687 | 0.1 | 300 | 2.0746 | 6.16 | 24.2 | 158.8474 |
| 2.5615 | 0.13 | 400 | 1.9379 | 8.68 | 26.18 | 120.8465 |
| 2.4554 | 0.16 | 500 | 1.8932 | 12.14 | 28.94 | 103.1067 |
| 2.3546 | 0.2 | 600 | 1.8734 | 14.34 | 29.83 | 91.5353 |
| 2.2804 | 0.23 | 700 | 1.8075 | 13.18 | 33.07 | 105.5380 |
| 2.1408 | 0.26 | 800 | 1.7034 | 13.01 | 33.0 | 89.4642 |
| 2.0411 | 0.3 | 900 | 1.6556 | 16.73 | 34.97 | 91.4453 |
| 1.7766 | 0.33 | 1000 | 1.6505 | 17.21 | 35.54 | 83.5209 |
| 1.7704 | 0.36 | 1100 | 1.5800 | 17.54 | 38.11 | 77.1724 |
| 1.6537 | 0.39 | 1200 | 1.5684 | 14.2 | 35.39 | 95.6326 |
| 1.4841 | 0.43 | 1300 | 1.4970 | 22.96 | 39.35 | 71.3643 |
| 1.641 | 0.46 | 1400 | 1.4693 | 16.3 | 37.69 | 103.7821 |
| 1.393 | 0.49 | 1500 | 1.3923 | 27.21 | 43.87 | 69.3381 |
| 1.249 | 0.53 | 1600 | 1.3876 | 23.33 | 42.26 | 76.5421 |
| 1.3385 | 0.56 | 1700 | 1.3404 | 23.86 | 42.82 | 75.0563 |
| 1.2537 | 0.59 | 1800 | 1.3226 | 17.03 | 41.72 | 100.1801 |
| 1.2891 | 0.62 | 1900 | 1.2995 | 27.26 | 43.62 | 69.1580 |
| 1.226 | 0.66 | 2000 | 1.2605 | 30.89 | 47.34 | 63.5750 |
| 1.1268 | 0.69 | 2100 | 1.2783 | 27.43 | 45.97 | 67.4921 |
| 1.0007 | 0.72 | 2200 | 1.2521 | 27.21 | 47.25 | 71.0041 |
| 0.9565 | 0.76 | 2300 | 1.2219 | 31.65 | 48.07 | 64.2053 |
| 0.9309 | 0.79 | 2400 | 1.2193 | 31.4 | 48.18 | 64.1603 |
| 0.7923 | 0.82 | 2500 | 1.2099 | 28.88 | 48.89 | 69.7884 |
| 0.8199 | 0.85 | 2600 | 1.1972 | 29.37 | 48.07 | 67.3120 |
| 0.6974 | 0.89 | 2700 | 1.1857 | 29.7 | 48.95 | 70.5988 |
| 0.6736 | 0.92 | 2800 | 1.1884 | 29.33 | 48.97 | 72.7150 |
| 0.6826 | 0.95 | 2900 | 1.1834 | 30.76 | 50.11 | 68.1225 |
| 0.7001 | 0.99 | 3000 | 1.1742 | 30.16 | 50.72 | 69.9685 |
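The released checkpoint was chosen by ChrF, and the other metrics do not necessarily peak at the same step. The sketch below copies the rows from step 2000 onward out of the results above and picks the best row under each metric. The `Checkpoint` tuple and the variable names are illustrative only. The selection logic is a plain max/min over the logged values, not the Trainer's own best-model mechanism.

```python
from typing import NamedTuple


class Checkpoint(NamedTuple):
    step: int
    val_loss: float
    bleu: float
    chrf: float
    wer: float


# Rows for steps 2000-3000, copied from the training results.
rows = [
    Checkpoint(2000, 1.2605, 30.89, 47.34, 63.5750),
    Checkpoint(2100, 1.2783, 27.43, 45.97, 67.4921),
    Checkpoint(2200, 1.2521, 27.21, 47.25, 71.0041),
    Checkpoint(2300, 1.2219, 31.65, 48.07, 64.2053),
    Checkpoint(2400, 1.2193, 31.40, 48.18, 64.1603),
    Checkpoint(2500, 1.2099, 28.88, 48.89, 69.7884),
    Checkpoint(2600, 1.1972, 29.37, 48.07, 67.3120),
    Checkpoint(2700, 1.1857, 29.70, 48.95, 70.5988),
    Checkpoint(2800, 1.1884, 29.33, 48.97, 72.7150),
    Checkpoint(2900, 1.1834, 30.76, 50.11, 68.1225),
    Checkpoint(3000, 1.1742, 30.16, 50.72, 69.9685),
]

best_chrf = max(rows, key=lambda r: r.chrf)  # higher is better
best_bleu = max(rows, key=lambda r: r.bleu)  # higher is better
best_wer = min(rows, key=lambda r: r.wer)    # lower is better

print(best_chrf.step, best_bleu.step, best_wer.step)  # 3000 2300 2000
```

Among these rows, ChrF and validation loss are best at step 3000, while BLEU is highest at step 2300 and WER is lowest at step 2000, so a different selection metric would have picked a different checkpoint.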

Framework versions

  • Transformers 4.39.3
  • Pytorch 2.0.1+cu118
  • Datasets 2.18.0
  • Tokenizers 0.15.2