PThi35/whisper_large_v3

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set, which match the epoch 21 row (step 44331) of the training results:

  • Loss: 0.6966
  • CER (character error rate): 16.8717
  • WER (word error rate): 28.3301
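
For readers unfamiliar with the two error metrics, here is a minimal sketch of how WER and CER are commonly computed: the edit distance between reference and hypothesis, divided by the reference length. It assumes the values above are reported on a 0 to 100 scale and applies no text normalization; the card does not state which metric implementation or normalization was used, so the function names and the example sentences are illustrative only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the reference
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate as a percentage (assumed scale, see lead-in)."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate as a percentage, counting spaces as characters."""
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    ref = "the cat sat on the mat"   # illustrative sentences, not evaluation data
    hyp = "the cat sit on mat"
    print(f"WER: {wer(ref, hyp):.2f}")
    print(f"CER: {cer(ref, hyp):.2f}")
```

Because CER counts characters rather than whole words, it is usually lower than WER on the same transcripts, as it is in the results above.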

Model description

More information needed

Intended uses & limitations

More information needed
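
Until intended uses are documented, the following is a minimal sketch of how a checkpoint like this is typically loaded for transcription. It assumes the repository loads with the standard Transformers `automatic-speech-recognition` pipeline and that `transformers` and an audio backend such as ffmpeg are installed; the `transcribe` helper and the audio path are hypothetical.

```python
MODEL_ID = "PThi35/whisper_large_v3"  # repository name taken from this card


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one audio file and return the recognized text.

    Assumption: the checkpoint works with the generic ASR pipeline.
    The import is done lazily so that defining this helper does not
    require transformers to be installed.
    """
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    return asr(audio_path)["text"]


if __name__ == "__main__":
    # "sample.wav" is a placeholder path, not a file shipped with the model.
    print(transcribe("sample.wav"))
```

The first call downloads the weights, which are stored in F32 and are therefore large; the language and domain the model handles best are not documented in this card.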

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 2
  • mixed_precision_training: Native AMP
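
To make the batch-size and scheduler settings above concrete, here is a small sketch that derives the effective batch size and the learning rate at a given optimizer step under a linear schedule with warmup. Two values are assumptions rather than facts from the card: `NUM_DEVICES = 1` (inferred from 8 × 2 = 16) and `TOTAL_STEPS = 52775` (the last step in the training results, even though `num_epochs` is listed as 2). The function names are illustrative.

```python
PER_DEVICE_BATCH = 8      # train_batch_size
GRAD_ACCUM_STEPS = 2      # gradient_accumulation_steps
NUM_DEVICES = 1           # assumption: single device, since 8 * 2 = 16
PEAK_LR = 1e-5            # learning_rate
WARMUP_STEPS = 1000       # lr_scheduler_warmup_steps
TOTAL_STEPS = 52775       # assumption: final step in the training results


def effective_batch_size(per_device, accum_steps, devices=1):
    """Number of examples contributing to each optimizer update."""
    return per_device * accum_steps * devices


def linear_schedule_lr(step, peak_lr=PEAK_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup from 0 to peak_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return peak_lr * step / max(1, warmup)
    return peak_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(PER_DEVICE_BATCH, GRAD_ACCUM_STEPS, NUM_DEVICES))
    for step in (0, 500, 1000, 26888, 52775):
        print(f"step {step:>6}: lr = {linear_schedule_lr(step):.3e}")
```

With these settings the learning rate peaks at step 1000, roughly halfway through the first epoch of 2111 steps, and decays for the rest of the run.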

Training results

| Training Loss | Epoch | Step  | CER     | Validation Loss | WER     |
|---------------|-------|-------|---------|-----------------|---------|
| 1.8187        | 1.0   | 2111  | 45.3063 | 0.6914          | 66.9946 |
| 0.7214        | 2.0   | 4222  | 37.9743 | 0.6309          | 55.6871 |
| 0.5298        | 3.0   | 6333  | 29.0135 | 0.6026          | 45.6557 |
| 0.4164        | 4.0   | 8444  | 35.3396 | 0.6037          | 54.7338 |
| 0.3355        | 5.0   | 10555 | 27.3975 | 0.5956          | 42.6625 |
| 0.2702        | 6.0   | 12666 | 26.9102 | 0.6047          | 42.2039 |
| 0.2202        | 7.0   | 14777 | 21.7689 | 0.6023          | 35.8946 |
| 0.1806        | 8.0   | 16888 | 20.1071 | 0.6053          | 32.7984 |
| 0.1496        | 9.0   | 18999 | 20.3211 | 0.6262          | 33.2826 |
| 0.1227        | 10.0  | 21110 | 19.5237 | 0.6374          | 31.9854 |
| 0.1013        | 11.0  | 23221 | 18.4214 | 0.6532          | 30.6836 |
| 0.0859        | 12.0  | 25332 | 18.6292 | 0.6505          | 30.9128 |
| 0.0728        | 13.0  | 27443 | 19.0582 | 0.6658          | 31.7761 |
| 0.0629        | 14.0  | 29554 | 17.9456 | 0.6691          | 30.1198 |
| 0.0549        | 15.0  | 31665 | 17.5997 | 0.6693          | 29.5186 |
| 0.0479        | 16.0  | 33776 | 18.0434 | 0.6894          | 30.0882 |
| 0.043         | 17.0  | 35887 | 17.4846 | 0.6831          | 29.3805 |
| 0.0385        | 18.0  | 37998 | 17.9625 | 0.6906          | 29.9607 |
| 0.0344        | 19.0  | 40109 | 16.9491 | 0.6914          | 28.6015 |
| 0.0315        | 20.0  | 42220 | 16.9293 | 0.6968          | 28.3968 |
| 0.029         | 21.0  | 44331 | 16.8717 | 0.6966          | 28.3301 |
| 0.0254        | 22.0  | 46442 | 16.9834 | 0.6997          | 28.4600 |
| 0.0236        | 23.0  | 48553 | 16.9067 | 0.6967          | 28.3442 |
| 0.0227        | 24.0  | 50664 | 16.9137 | 0.7046          | 28.2646 |
| 0.0212        | 25.0  | 52775 | 16.8825 | 0.7053          | 28.2705 |
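
The results above show the three evaluation metrics reaching their best values at different epochs. The sketch below copies the rows from the results into a list and finds the best epoch for each metric; the `Row` type and `best_epoch` helper are illustrative names, and how the published checkpoint was actually selected is not stated in the card.

```python
from collections import namedtuple

Row = namedtuple("Row", "train_loss epoch step cer val_loss wer")

# Values copied from the training results in this card.
RESULTS = [
    Row(1.8187, 1, 2111, 45.3063, 0.6914, 66.9946),
    Row(0.7214, 2, 4222, 37.9743, 0.6309, 55.6871),
    Row(0.5298, 3, 6333, 29.0135, 0.6026, 45.6557),
    Row(0.4164, 4, 8444, 35.3396, 0.6037, 54.7338),
    Row(0.3355, 5, 10555, 27.3975, 0.5956, 42.6625),
    Row(0.2702, 6, 12666, 26.9102, 0.6047, 42.2039),
    Row(0.2202, 7, 14777, 21.7689, 0.6023, 35.8946),
    Row(0.1806, 8, 16888, 20.1071, 0.6053, 32.7984),
    Row(0.1496, 9, 18999, 20.3211, 0.6262, 33.2826),
    Row(0.1227, 10, 21110, 19.5237, 0.6374, 31.9854),
    Row(0.1013, 11, 23221, 18.4214, 0.6532, 30.6836),
    Row(0.0859, 12, 25332, 18.6292, 0.6505, 30.9128),
    Row(0.0728, 13, 27443, 19.0582, 0.6658, 31.7761),
    Row(0.0629, 14, 29554, 17.9456, 0.6691, 30.1198),
    Row(0.0549, 15, 31665, 17.5997, 0.6693, 29.5186),
    Row(0.0479, 16, 33776, 18.0434, 0.6894, 30.0882),
    Row(0.043, 17, 35887, 17.4846, 0.6831, 29.3805),
    Row(0.0385, 18, 37998, 17.9625, 0.6906, 29.9607),
    Row(0.0344, 19, 40109, 16.9491, 0.6914, 28.6015),
    Row(0.0315, 20, 42220, 16.9293, 0.6968, 28.3968),
    Row(0.029, 21, 44331, 16.8717, 0.6966, 28.3301),
    Row(0.0254, 22, 46442, 16.9834, 0.6997, 28.4600),
    Row(0.0236, 23, 48553, 16.9067, 0.6967, 28.3442),
    Row(0.0227, 24, 50664, 16.9137, 0.7046, 28.2646),
    Row(0.0212, 25, 52775, 16.8825, 0.7053, 28.2705),
]


def best_epoch(rows, metric):
    """Return the row with the lowest value of `metric` (lower is better)."""
    return min(rows, key=lambda row: getattr(row, metric))


if __name__ == "__main__":
    for metric in ("val_loss", "cer", "wer"):
        row = best_epoch(RESULTS, metric)
        print(f"best {metric}: {getattr(row, metric)} at epoch {row.epoch}")
```

Validation loss is lowest at epoch 5 (0.5956) and rises afterwards while training loss keeps falling, yet CER and WER continue to improve until epochs 21 and 24 respectively. The headline results at the top of the card correspond to the lowest-CER row.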

Framework versions

  • Transformers 4.55.4
  • PyTorch 2.8.0+cu126
  • Datasets 4.1.1
  • Tokenizers 0.21.4