
distil-whisper-large-v3-tr

Model Description

distil-whisper-large-v3-tr is a distilled Whisper model fine-tuned for Turkish automatic speech recognition. It was trained and evaluated on the Common Voice 17.0 Turkish pseudo-labelled dataset (see the run metrics below).
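
Since the checkpoint follows the standard Whisper architecture, it can presumably be loaded through the transformers ASR pipeline. A minimal usage sketch (the audio file name is a placeholder):

```python
# Minimal usage sketch, assuming the standard Whisper interface in transformers.
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="Sercan/distil-whisper-large-v3-tr",
    torch_dtype=torch.float32,  # the published weights are stored in F32
)

result = asr("sample_tr.wav", generate_kwargs={"language": "turkish"})
print(result["text"])
```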

Training and Evaluation Metrics

Training and evaluation were tracked with Weights & Biases (wandb); the final logged results are as follows:

Evaluation Metrics

  • Cross-Entropy Loss (eval/ce_loss): 0.53218
  • Epoch (eval/epoch): 28
  • KL Loss (eval/kl_loss): 0.34883
  • Total Loss (eval/loss): 0.77457
  • Evaluation Time (eval/time): 397.1784 seconds
  • Word Error Rate (eval/wer): 14.43288% (see the normalization note after this list)
  • Orthographic Word Error Rate (eval/wer_ortho): 21.55298%
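
The gap between the two WER figures above comes from text normalization: eval/wer is presumably computed on normalized text, while eval/wer_ortho scores the raw orthography, where casing and punctuation count as errors. A minimal sketch of that computation, assuming the usual evaluate and transformers tooling (the card does not state the exact metric code; the transcripts are placeholders):

```python
# Sketch of the normalized vs. orthographic WER computation (assumed tooling).
from evaluate import load
from transformers.models.whisper.english_normalizer import BasicTextNormalizer

wer_metric = load("wer")
normalizer = BasicTextNormalizer()  # language-agnostic normalizer for non-English text

references = ["Merhaba, dünya!"]  # placeholder reference transcript
predictions = ["merhaba dünya"]   # placeholder model output

# Orthographic WER: raw text, so casing and punctuation count as errors.
wer_ortho = wer_metric.compute(references=references, predictions=predictions)

# Normalized WER: both sides are normalized before scoring.
wer = wer_metric.compute(
    references=[normalizer(r) for r in references],
    predictions=[normalizer(p) for p in predictions],
)
print(f"wer_ortho={wer_ortho:.4f}, wer={wer:.4f}")
```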

Training Metrics

  • Cross-Entropy Loss (train/ce_loss): 0.04695
  • Epoch (train/epoch): 28
  • KL Loss (train/kl_loss): 0.24143
  • Learning Rate (train/learning_rate): 0.0001
  • Total Loss (train/loss): 0.27899
  • Training Time (train/time): 12426.92106 seconds
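
As a consistency check (an inference from the logged numbers, not something stated in the card), the totals on both splits match a weighted sum of the two terms with a CE weight of 0.8 and a KL weight of 1.0, the default objective in the Distil-Whisper training recipe:

$$
\mathcal{L}_{\text{total}} = 0.8\,\mathcal{L}_{\text{CE}} + 1.0\,\mathcal{L}_{\text{KL}}
$$

This reproduces both figures: 0.8 × 0.53218 + 0.34883 = 0.77457 (eval) and 0.8 × 0.04695 + 0.24143 = 0.27899 (train).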

Run History

Overall Metrics

  • Real-Time Factor (all/rtf): 392.23396
  • Word Error Rate (all/wer): 14.33829
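
A real-time factor well above 1 only makes sense under the speed-up convention, so the figure is presumably audio seconds transcribed per second of wall-clock time (about 392× faster than real time). A self-contained measurement sketch under that assumption; the file name is hypothetical:

```python
# Measures RTF as audio duration divided by transcription wall-clock time
# (assumed convention; ~392 would mean ~392x faster than real time).
import time

import librosa
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="Sercan/distil-whisper-large-v3-tr",
)

audio_path = "sample_tr.wav"  # placeholder audio file
audio_seconds = librosa.get_duration(path=audio_path)

start = time.perf_counter()
asr(audio_path)
elapsed = time.perf_counter() - start

print(f"RTF: {audio_seconds / elapsed:.2f}")
```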

Common Voice 17.0 Turkish Pseudo-Labelled Dataset

  • Real-Time Factor (common_voice_17_0_tr_pseudo_labelled/test/rtf): 392.23396
  • Word Error Rate (common_voice_17_0_tr_pseudo_labelled/test/wer): 14.33829
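
The overall and per-dataset figures are identical, which suggests this pseudo-labelled Common Voice test split was the only evaluation set in the run. The exact pseudo-labelled repository is not named in the card, so the loading sketch below falls back to the upstream Common Voice 17.0 Turkish split (an assumption; the gated dataset requires accepting its terms on the Hub):

```python
# Loads the upstream Common Voice 17.0 Turkish test split as a stand-in;
# the pseudo-labelled dataset actually used for this run is not named here.
from datasets import load_dataset

cv_tr_test = load_dataset(
    "mozilla-foundation/common_voice_17_0",  # gated: accept the terms on the Hub first
    "tr",
    split="test",
)
print(cv_tr_test)
```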

Author

Sercan Çepni
Email: turkelf@gmail.com


For any questions or further information, please feel free to contact the author.

Model Details

  • Format: Safetensors
  • Model size: 756M parameters
  • Tensor type: F32
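
The 756M figure matches the size of Distil-Whisper large-v3 checkpoints and can be verified directly; a sketch assuming the standard transformers loading path:

```python
# Quick parameter-count check against the 756M figure above.
from transformers import AutoModelForSpeechSeq2Seq

model = AutoModelForSpeechSeq2Seq.from_pretrained("Sercan/distil-whisper-large-v3-tr")
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.0f}M parameters")  # expected: ~756M
```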