Tags: Automatic Speech Recognition, Transformers, TensorBoard, Safetensors, Irish, English, whisper, generated_from_trainer, Inference Endpoints

whisper-small-ga2en-v3.2-r

This model is a fine-tuned version of openai/whisper-small, trained on the IWSLT-2023, FLEURS, BiteSize, SpokenWords, Tatoeba, and Wikimedia datasets. This version is the best checkpoint by ChrF, taken at step 2700 (epoch 3.5433), and it achieves the following results on the evaluation set:

  • Loss: 1.4313
  • BLEU: 30.87
  • ChrF: 47.72
  • WER: 64.2954
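
The WER figure above is a percentage, and the training log further down shows that it can exceed 100 (for example at step 100). The card does not include the evaluation script, so the following is only a minimal sketch of the standard word-level definition (edit distance divided by reference length); the function name `word_error_rate` and the whitespace tokenization are assumptions, and the author's actual metric implementation may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: word-level edit distance / reference length."""
    ref = reference.split()  # assumption: plain whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)
```

Because insertions count as errors but the denominator is the reference length, a hypothesis that is much longer than its reference yields a WER above 100.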

Model description

More information needed

Intended uses & limitations

More information needed
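
As a minimal sketch of one likely use, the checkpoint can presumably be loaded through the Transformers `pipeline` API for automatic speech recognition. The repository ID is the one shown on this card; the helper name `translate_irish_speech` is hypothetical, and the assumption that the output is English text for Irish audio rests only on the model name ("ga2en") and the BLEU/ChrF metrics, not on a documented statement.

```python
MODEL_ID = "ymoslem/whisper-small-ga2en-v3.2-r"  # repository ID as shown on this card


def translate_irish_speech(audio_path: str) -> str:
    """Run the checkpoint on one audio file and return the generated text.

    Assumption: default generation settings stored with the checkpoint are
    appropriate; no language or task overrides are passed here.
    """
    # Imported lazily so this module loads even where transformers is absent.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=MODEL_ID)
    result = asr(audio_path)  # the pipeline returns a dict with a "text" key
    return result["text"]
```

Calling `translate_irish_speech("sample_irish.wav")` (a hypothetical file name) would download the model on first use and requires an audio backend such as ffmpeg to decode the file.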

Training and evaluation data

More information needed

Training procedure

Hardware

1 × NVIDIA A100-SXM4-80GB

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • training_steps: 3000
  • mixed_precision_training: Native AMP
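
For readers who want to reproduce a similar setup, the values listed above can be mapped onto `Seq2SeqTrainingArguments` roughly as follows. This is a sketch under assumptions, not the author's training script: the card does not say which arguments class was used, `fp16=True` is an interpretation of "Native AMP", the per-device batch sizes assume the single GPU listed under Hardware, and `build_training_arguments` and its `output_dir` parameter are hypothetical.

```python
# Hyperparameters from the card, expressed as Transformers argument names.
HYPERPARAMETERS = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 64,  # assumes one GPU, so total batch size is 64
    "per_device_eval_batch_size": 64,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "max_steps": 3000,
    "fp16": True,  # assumption: "Native AMP" corresponds to fp16 mixed precision
}


def build_training_arguments(output_dir: str):
    """Create Seq2SeqTrainingArguments from the values listed on the card."""
    # Imported lazily so the dictionary above can be inspected without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```

Note that constructing the arguments with `fp16=True` generally requires a GPU that supports half-precision training.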

Training results

Step  Training Loss  Validation Loss  BLEU   ChrF   WER
100   2.311900       1.773697         9.20   28.23  120.486267
200   1.870000       1.479052         16.90  33.01  83.701036
300   1.627700       1.372679         20.33  38.68  84.061234
400   1.460400       1.309611         24.52  40.37  74.696083
500   1.336300       1.283173         20.35  40.77  86.537596
600   1.193300       1.255632         20.63  41.37  95.632598
700   1.015600       1.251285         21.24  41.42  82.170194
800   0.518300       1.292586         28.08  44.76  66.951824
900   0.467300       1.329429         25.16  42.93  76.316974
1000  0.438700       1.330984         28.29  46.08  67.672220
1100  0.403600       1.300828         27.43  46.32  68.977938
1200  0.379500       1.323791         30.02  45.48  63.800090
1300  0.337100       1.327949         30.40  47.61  61.999100
1400  0.288000       1.359497         28.13  44.60  66.501576
1500  0.265100       1.355470         26.58  45.51  71.319226
1600  0.100800       1.400149         26.19  46.02  72.985142
1700  0.092300       1.383455         24.83  46.18  77.532643
1800  0.103900       1.404863         22.19  43.19  88.743809
1900  0.090100       1.402833         29.73  45.85  66.186403
2000  0.084200       1.418717         28.18  45.29  73.570464
2100  0.074800       1.461650         26.58  44.66  74.020711
2200  0.072000       1.400547         31.01  47.30  61.143629
2300  0.042900       1.424147         28.72  45.53  65.511031
2400  0.025200       1.412174         27.18  47.19  74.020711
2500  0.026500       1.438945         30.01  46.73  65.105808
2600  0.023300       1.454140         30.93  46.65  62.404322
2700  0.021600       1.431275         30.87  47.72  64.295362
2800  0.019200       1.439022         30.50  46.98  65.150833
2900  0.018200       1.439916         31.09  47.27  63.529941
3000  0.019000       1.444545         30.83  47.35  64.205313
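
The card states that the released checkpoint was chosen by ChrF. The short sketch below copies the evaluation columns of the table above and shows how that choice falls out of the log, and that other metrics would have picked different steps. The variable names are illustrative only; this is not the selection code used during training.

```python
# (step, validation_loss, bleu, chrf, wer), copied from the training results table.
ROWS = [
    (100, 1.773697, 9.20, 28.23, 120.486267),
    (200, 1.479052, 16.90, 33.01, 83.701036),
    (300, 1.372679, 20.33, 38.68, 84.061234),
    (400, 1.309611, 24.52, 40.37, 74.696083),
    (500, 1.283173, 20.35, 40.77, 86.537596),
    (600, 1.255632, 20.63, 41.37, 95.632598),
    (700, 1.251285, 21.24, 41.42, 82.170194),
    (800, 1.292586, 28.08, 44.76, 66.951824),
    (900, 1.329429, 25.16, 42.93, 76.316974),
    (1000, 1.330984, 28.29, 46.08, 67.672220),
    (1100, 1.300828, 27.43, 46.32, 68.977938),
    (1200, 1.323791, 30.02, 45.48, 63.800090),
    (1300, 1.327949, 30.40, 47.61, 61.999100),
    (1400, 1.359497, 28.13, 44.60, 66.501576),
    (1500, 1.355470, 26.58, 45.51, 71.319226),
    (1600, 1.400149, 26.19, 46.02, 72.985142),
    (1700, 1.383455, 24.83, 46.18, 77.532643),
    (1800, 1.404863, 22.19, 43.19, 88.743809),
    (1900, 1.402833, 29.73, 45.85, 66.186403),
    (2000, 1.418717, 28.18, 45.29, 73.570464),
    (2100, 1.461650, 26.58, 44.66, 74.020711),
    (2200, 1.400547, 31.01, 47.30, 61.143629),
    (2300, 1.424147, 28.72, 45.53, 65.511031),
    (2400, 1.412174, 27.18, 47.19, 74.020711),
    (2500, 1.438945, 30.01, 46.73, 65.105808),
    (2600, 1.454140, 30.93, 46.65, 62.404322),
    (2700, 1.431275, 30.87, 47.72, 64.295362),
    (2800, 1.439022, 30.50, 46.98, 65.150833),
    (2900, 1.439916, 31.09, 47.27, 63.529941),
    (3000, 1.444545, 30.83, 47.35, 64.205313),
]

# Higher is better for BLEU and ChrF; lower is better for loss and WER.
best_by_chrf = max(ROWS, key=lambda row: row[3])
best_by_bleu = max(ROWS, key=lambda row: row[2])
best_by_wer = min(ROWS, key=lambda row: row[4])
best_by_loss = min(ROWS, key=lambda row: row[1])

print("Best ChrF at step", best_by_chrf[0])
print("Best BLEU at step", best_by_bleu[0])
print("Lowest WER at step", best_by_wer[0])
print("Lowest validation loss at step", best_by_loss[0])
```

In this log the four criteria disagree, so the reported results depend on ChrF having been the selection metric.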

Framework versions

  • Transformers 4.40.2
  • PyTorch 2.2.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1

Model details

  • Model ID: ymoslem/whisper-small-ga2en-v3.2-r
  • Model size: 242M params
  • Tensor type: F32
  • Format: Safetensors