Whisper base AR - BH

This model is a fine-tuned version of openai/whisper-base on the quran-ayat-speech-to-text dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0139
  • WER: 9.5114
  • CER: 2.8698
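
A minimal inference sketch, assuming the checkpoint is loaded by the repo id shown on this model's page and that a local audio file (the placeholder `recitation.wav`) holds a recitation to transcribe:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint by its Hugging Face repo id.
asr = pipeline(
    "automatic-speech-recognition",
    model="Baselhany/Whisper_base_Quran_GP",
)

# Whisper operates on 16 kHz audio; the pipeline handles decoding and
# resampling of common file formats. "recitation.wav" is a placeholder path.
print(asr("recitation.wav")["text"])
```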

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent Seq2SeqTrainingArguments sketch follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 10
  • mixed_precision_training: Native AMP
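
The sketch below is a reconstruction of how these hyperparameters map onto Seq2SeqTrainingArguments, not the author's actual training script; the output directory is a placeholder, and the model, dataset, and data collator are omitted.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-base-quran",   # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,       # 16 x 4 = 64 effective train batch size
    seed=42,
    optim="adamw_torch",                 # AdamW with betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="cosine",
    warmup_steps=500,
    num_train_epochs=10,
    fp16=True,                           # "Native AMP" mixed precision
)
```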

Training results

| Training Loss | Epoch  | Step  | CER    | Validation Loss | WER     |
|:-------------:|:------:|:-----:|:------:|:---------------:|:-------:|
| 0.0124        | 0.2895 | 800   | 6.9720 | 0.0166          | 21.3510 |
| 0.0076        | 0.5790 | 1600  | 4.4857 | 0.0124          | 14.3371 |
| 0.0042        | 0.8685 | 2400  | 4.2342 | 0.0112          | 13.1816 |
| 0.0053        | 1.1581 | 3200  | 4.8224 | 0.0133          | 14.4143 |
| 0.0041        | 1.4476 | 4000  | 4.0206 | 0.0121          | 12.9768 |
| 0.0023        | 1.7371 | 4800  | 3.7118 | 0.0116          | 11.9643 |
| 0.0022        | 2.0268 | 5600  | 4.0467 | 0.0125          | 12.7101 |
| 0.002         | 2.3163 | 6400  | 3.7803 | 0.0125          | 12.1962 |
| 0.0016        | 2.6058 | 7200  | 3.7763 | 0.0124          | 12.2696 |
| 0.0018        | 2.8952 | 8000  | 3.6627 | 0.0122          | 12.0570 |
| 0.0013        | 3.1849 | 8800  | 3.6893 | 0.0126          | 12.0957 |
| 0.0015        | 3.4744 | 9600  | 3.6893 | 0.0126          | 12.2232 |
| 0.0013        | 3.7639 | 10400 | 3.6023 | 0.0124          | 11.8561 |
| 0.0009        | 4.0536 | 11200 | 3.6514 | 0.0127          | 11.9836 |
| 0.0009        | 4.3430 | 12000 | 3.5554 | 0.0125          | 11.6976 |
| 0.0008        | 4.6325 | 12800 | 3.4661 | 0.0130          | 11.5585 |
| 0.0009        | 4.9220 | 13600 | 3.4242 | 0.0130          | 11.4735 |
| 0.0007        | 5.2117 | 14400 | 3.5752 | 0.0131          | 11.9102 |
| 0.0008        | 5.5012 | 15200 | 3.5531 | 0.0133          | 11.7595 |
| 0.0008        | 5.7907 | 16000 | 3.5058 | 0.0134          | 11.6358 |
| 0.0006        | 6.0803 | 16800 | 3.5428 | 0.0135          | 11.8290 |
| 0.0005        | 6.3698 | 17600 | 3.4418 | 0.0136          | 11.4851 |
| 0.0006        | 6.6593 | 18400 | 3.4526 | 0.0137          | 11.5392 |
| 0.0007        | 6.9488 | 19200 | 3.4477 | 0.0137          | 11.5160 |
| 0.0004        | 7.2385 | 20000 | 3.5631 | 0.0138          | 11.6667 |
| 0.0003        | 7.5280 | 20800 | 3.4923 | 0.0140          | 11.6435 |
| 0.0004        | 7.8174 | 21600 | 3.5216 | 0.0140          | 11.6822 |
| 0.0003        | 8.1071 | 22400 | 3.4522 | 0.0142          | 11.6204 |
| 0.0004        | 8.3966 | 23200 | 3.4639 | 0.0142          | 11.6590 |
| 0.0003        | 8.6861 | 24000 | 3.4927 | 0.0143          | 11.7015 |
| 0.0004        | 8.9756 | 24800 | 3.4977 | 0.0143          | 11.6861 |
| 0.0003        | 9.2652 | 25600 | 3.4400 | 0.0146          | 11.7440 |
| 0.0003        | 9.5547 | 26400 | 3.4954 | 0.0145          | 11.7904 |
| 0.0003        | 9.8442 | 27200 | 3.4896 | 0.0145          | 11.7672 |
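
The WER and CER columns are presumably percentages. A small sketch (not the author's evaluation code) of computing both metrics with the Hugging Face evaluate library, using placeholder transcripts:

```python
import evaluate

# Word and character error rate metrics from the `evaluate` library.
wer = evaluate.load("wer")
cer = evaluate.load("cer")

# Placeholder transcripts; in practice these come from the eval split.
references = ["qul huwa allahu ahad"]
predictions = ["qul huwa allah ahad"]

# compute() returns a fraction; multiply by 100 to match the table's scale.
print(100 * wer.compute(predictions=predictions, references=references))
print(100 * cer.compute(predictions=predictions, references=references))
```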

Framework versions

  • Transformers 4.47.0
  • PyTorch 2.5.1+cu121
  • Datasets 3.3.1
  • Tokenizers 0.21.0