
Whisper Base Arabic

The model achieves the following results on the evaluation set:

  • Loss: 0.44
  • WER: 34.7

Training and evaluation data

Training set:

  • mozilla-foundation/common_voice_16_0 ar [train+validation]
  • BelalElhossany/mgb2_audios_transcriptions_non_overlap
  • nadsoft/Jordan-Audio

Validation set: 600 samples in total, drawn from the three datasets above. The set was kept small to save time during training, because the model was trained on the Colab free tier. Note: evaluate the model's accuracy in whatever way suits your use case.
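Since the reported WER comes from a small validation set, it can help to recompute it on your own data. The following is a minimal sketch of a corpus-level word error rate, assuming whitespace tokenization and no text normalization; the function name `word_error_rate` is illustrative and is not the evaluation code used for this model.

```python
def word_error_rate(references, hypotheses):
    """Corpus-level WER in percent: total word edits / total reference words."""
    total_edits = 0
    total_words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = ref.split()  # assumption: whitespace tokenization
        hyp_words = hyp.split()
        # Word-level Levenshtein distance, computed one row at a time.
        prev = list(range(len(hyp_words) + 1))
        for i, ref_word in enumerate(ref_words, start=1):
            curr = [i]
            for j, hyp_word in enumerate(hyp_words, start=1):
                cost = 0 if ref_word == hyp_word else 1
                curr.append(min(prev[j] + 1,          # deletion
                                curr[j - 1] + 1,      # insertion
                                prev[j - 1] + cost))  # substitution or match
            prev = curr
        total_edits += prev[-1]
        total_words += len(ref_words)
    if total_words == 0:
        raise ValueError("references contain no words")
    return 100.0 * total_edits / total_words


# One substitution and one deletion against four reference words.
print(word_error_rate(["a b c d"], ["a x c"]))  # 50.0
```

For a fair comparison with the numbers in this card, the reference transcripts would need the same preprocessing as the training text.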

Training procedure

Arabic diacritics (حركات, harakat) were removed from the transcripts. The model was trained on the combined dataset for 6 epochs; the fifth epoch gave the best results, so the released model is the checkpoint from epoch 5.
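The diacritic removal step could look roughly like the sketch below. The exact character set that was stripped is not stated in this card; the range U+064B to U+0652 (tanween, short vowels, shadda, sukun) plus the superscript alef U+0670 is an assumption, and `strip_harakat` is a hypothetical helper name.

```python
import re

# Assumption: "harakat" means U+064B..U+0652 plus superscript alef (U+0670).
HARAKAT = re.compile("[\u064B-\u0652\u0670]")


def strip_harakat(text):
    """Return text with Arabic diacritic marks removed; other characters are kept."""
    return HARAKAT.sub("", text)


# A fully vowelled word and its bare form.
print(strip_harakat("مَرْحَبًا"))
```

Applying the same function to reference transcripts before scoring keeps evaluation consistent with the text the model was trained on.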

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 1
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
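To illustrate what a linear scheduler with 500 warmup steps does to the learning rate, here is a plain-Python sketch. It assumes the usual shape (linear ramp from 0 to the base rate, then linear decay to 0) and a total of 8622 steps, derived from the 6 epochs and 1437 steps per epoch mentioned in this card; the total actually used by the training run is not stated, and `linear_lr` is an illustrative name.

```python
def linear_lr(step, base_lr=1e-05, warmup_steps=500, total_steps=8622):
    """Learning rate at a given optimizer step under linear warmup and linear decay.

    total_steps=8622 is an assumption (6 epochs x 1437 steps per epoch).
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay: fall linearly from base_lr to 0 at total_steps, never below 0.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


# Learning rate at the end of each epoch listed in the results table.
for step in (1437, 2874, 4311, 5748, 7185):
    print(step, linear_lr(step))
```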

Training results

Training Loss   Epoch   Step   Validation Loss   WER
0.4603          1       1437   0.4931            45.8857
0.2867          2       2874   0.4493            36.9973
0.2494          3       4311   0.4219            43.5553
0.1435          4       5748   0.4408            40.2351
0.1345          5       7185   0.4407            34.7081
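The choice of checkpoint depends on the selection metric. The short sketch below copies the rows of the results table and picks the best epoch by WER and by validation loss; the field names are illustrative.

```python
from collections import namedtuple

Row = namedtuple("Row", "train_loss epoch step val_loss wer")

# Values copied from the training results table.
results = [
    Row(0.4603, 1, 1437, 0.4931, 45.8857),
    Row(0.2867, 2, 2874, 0.4493, 36.9973),
    Row(0.2494, 3, 4311, 0.4219, 43.5553),
    Row(0.1435, 4, 5748, 0.4408, 40.2351),
    Row(0.1345, 5, 7185, 0.4407, 34.7081),
]

best_by_wer = min(results, key=lambda row: row.wer)
best_by_loss = min(results, key=lambda row: row.val_loss)

print("lowest WER: epoch", best_by_wer.epoch)               # epoch 5
print("lowest validation loss: epoch", best_by_loss.epoch)  # epoch 3
```

Selecting by WER gives epoch 5, which matches the released checkpoint, while validation loss alone would favor epoch 3.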

Model size: 72.6M params (Safetensors, tensor type F32)

Model repository: YazanSalameh/Whisper-base-Arabic