Whisper Base Hu v2

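This model is a fine-tuned version of openai/whisper-base on the Common Voice 16.0 dataset. On the evaluation set it achieves the following results, which match the final checkpoint (step 15000) in the training results table:

  • Loss: 0.1599
  • Wer Ortho (orthographic word error rate, %): 12.6641
  • Wer (word error rate, %): 11.4171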
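
The card does not say how the two word error rates were computed. A common convention is that an orthographic WER is measured on the raw text, while the plain WER is measured after text normalization. The sketch below illustrates that difference with a self-contained word-level edit distance. The `normalize` function (lowercasing and stripping punctuation) and the sample sentences are assumptions made for illustration, not the normalizer or data used for this model.

```python
import re


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference word
                            curr[j - 1] + 1,      # insertion of a hypothesis word
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(references, hypotheses):
    """Corpus-level WER in percent: total edits / total reference words."""
    errors = sum(edit_distance(r.split(), h.split())
                 for r, h in zip(references, hypotheses))
    words = sum(len(r.split()) for r in references)
    return 100.0 * errors / words


def normalize(text):
    """Hypothetical normalizer (an assumption): lowercase, drop punctuation."""
    return re.sub(r"[^\w\s]", "", text.lower()).strip()


# Made-up example sentences, not taken from the evaluation set.
refs = ["Jó reggelt kívánok!", "Hol van a pályaudvar?"]
hyps = ["jó reggelt kívánok", "Hol van a pályaudvar"]

wer_ortho = wer(refs, hyps)
wer_norm = wer([normalize(r) for r in refs], [normalize(h) for h in hyps])
print(f"orthographic WER: {wer_ortho:.2f}%  normalized WER: {wer_norm:.2f}%")
```

Casing and punctuation mismatches count as errors only in the orthographic score, which is why an orthographic figure is typically the higher of the two.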

Model description

More information needed

Intended uses & limitations

More information needed
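
Until this section is filled in, the following is a minimal sketch of how a Whisper checkpoint like this one could be loaded for transcription with the Transformers `pipeline` API. The repository name is the one that appears on this page; the `transcribe` helper, the audio path argument, and the choice to force Hungarian decoding through `generate_kwargs` are assumptions for illustration and are not documented by the authors.

```python
def transcribe(audio_path, model_id="Hungarians/whisper-base-cv16-hu-v2"):
    """Transcribe one audio file; a hypothetical helper, not part of the card."""
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # Assumption: the audio is Hungarian speech, so decoding is pinned to
    # Hungarian transcription instead of relying on language detection.
    result = asr(
        audio_path,
        generate_kwargs={"language": "hungarian", "task": "transcribe"},
    )
    return result["text"]
```

Calling `transcribe("sample.wav")` downloads the weights on first use; decoding a file path generally requires ffmpeg to be installed.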

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2.75e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_steps: 500
  • training_steps: 15000
  • mixed_precision_training: Native AMP
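
To make the `constant_with_warmup` setting concrete, here is a small sketch of the learning rate it implies: a linear ramp from zero over the first 500 steps, then a constant 2.75e-05 until step 15000. The function name is hypothetical, and this is a plain-Python approximation of the schedule's shape, not the Trainer's own code.

```python
def constant_with_warmup_lr(step, base_lr=2.75e-05, warmup_steps=500):
    """Learning rate at a given optimizer step (illustrative sketch)."""
    if step < warmup_steps:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # After warmup the rate stays constant; there is no decay.
    return base_lr


for step in (0, 250, 500, 15000):
    print(step, constant_with_warmup_lr(step))
```

Because the rate never decays, the final checkpoints are trained at the same learning rate as the ones right after warmup.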

Training results

| Training Loss | Epoch | Step  | Validation Loss | Wer Ortho | Wer     |
|---------------|-------|-------|-----------------|-----------|---------|
| 0.199         | 0.33  | 1000  | 0.3838          | 36.7548   | 33.5517 |
| 0.3037        | 0.67  | 2000  | 0.3131          | 31.2748   | 28.3664 |
| 0.221         | 1.0   | 3000  | 0.2546          | 27.1739   | 24.1773 |
| 0.1562        | 1.34  | 4000  | 0.2319          | 23.9341   | 21.3341 |
| 0.1623        | 1.67  | 5000  | 0.2101          | 21.4079   | 18.9623 |
| 0.077         | 2.01  | 6000  | 0.1818          | 18.5415   | 16.2852 |
| 0.078         | 2.34  | 7000  | 0.1846          | 17.8339   | 15.7456 |
| 0.0818        | 2.68  | 8000  | 0.1712          | 16.4669   | 14.5983 |
| 0.0352        | 3.01  | 9000  | 0.1669          | 15.6178   | 14.0676 |
| 0.0413        | 3.35  | 10000 | 0.1673          | 14.9464   | 13.4539 |
| 0.0454        | 3.68  | 11000 | 0.1649          | 14.5459   | 12.7542 |
| 0.0225        | 4.02  | 12000 | 0.1589          | 13.5885   | 12.2087 |
| 0.0269        | 4.35  | 13000 | 0.1638          | 14.3864   | 12.8343 |
| 0.0299        | 4.69  | 14000 | 0.1621          | 13.0555   | 11.7610 |
| 0.0171        | 5.02  | 15000 | 0.1599          | 12.6641   | 11.4171 |
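
The table can be read programmatically to see which checkpoint is best under each metric. The sketch below copies the step, epoch, validation loss, and both WER columns from the table and picks the minimum of each; it also derives a rough steps-per-epoch figure. The estimate of examples per epoch assumes a single device and no gradient accumulation, neither of which is stated in the card.

```python
# (step, epoch, validation_loss, wer_ortho, wer), copied from the table above.
rows = [
    (1000, 0.33, 0.3838, 36.7548, 33.5517),
    (2000, 0.67, 0.3131, 31.2748, 28.3664),
    (3000, 1.0, 0.2546, 27.1739, 24.1773),
    (4000, 1.34, 0.2319, 23.9341, 21.3341),
    (5000, 1.67, 0.2101, 21.4079, 18.9623),
    (6000, 2.01, 0.1818, 18.5415, 16.2852),
    (7000, 2.34, 0.1846, 17.8339, 15.7456),
    (8000, 2.68, 0.1712, 16.4669, 14.5983),
    (9000, 3.01, 0.1669, 15.6178, 14.0676),
    (10000, 3.35, 0.1673, 14.9464, 13.4539),
    (11000, 3.68, 0.1649, 14.5459, 12.7542),
    (12000, 4.02, 0.1589, 13.5885, 12.2087),
    (13000, 4.35, 0.1638, 14.3864, 12.8343),
    (14000, 4.69, 0.1621, 13.0555, 11.7610),
    (15000, 5.02, 0.1599, 12.6641, 11.4171),
]

best_loss = min(rows, key=lambda r: r[2])  # lowest validation loss
best_wer = min(rows, key=lambda r: r[4])   # lowest WER

final_step, final_epoch = rows[-1][0], rows[-1][1]
steps_per_epoch = final_step / final_epoch
# Assumption: effective batch size equals train_batch_size (16).
approx_examples_per_epoch = steps_per_epoch * 16

print("lowest validation loss at step", best_loss[0])
print("lowest WER at step", best_wer[0])
print("approx. steps per epoch:", round(steps_per_epoch))
```

The lowest validation loss (0.1589) occurs at step 12000, while both WER columns reach their minimum at the final step, 15000, so the reported results correspond to the best-WER checkpoint rather than the best-loss one.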

Framework versions

  • Transformers 4.36.2
  • PyTorch 2.1.0+cu121
  • Datasets 2.16.1
  • Tokenizers 0.15.0

Model size: 72.6M params, tensor type F32 (Safetensors)

Dataset used to train Hungarians/whisper-base-cv16-hu-v2