Whisper Medium Basque

This model (xezpeleta/whisper-medium-eu) is a fine-tuned version of openai/whisper-medium, trained on the Basque (eu) subset of the mozilla-foundation/common_voice_17_0 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1787
  • Wer: 8.8021
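For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between a reference transcript and the model's hypothesis, divided by the number of reference words. The sketch below is a minimal pure-Python illustration, not the evaluation code used for this model. It assumes the values reported here are percentages and that no text normalization is applied; the function name `word_error_rate` and the sample sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one substituted word out of four reference words.
    print(word_error_rate("kaixo zer moduz zaude", "kaixo zer modu zaude"))
```

Evaluation pipelines often lowercase text and strip punctuation before scoring, so numbers from this sketch may not match a given evaluation script exactly.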

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 6.25e-06
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 8000
  • mixed_precision_training: Native AMP
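To make the scheduler settings concrete, the sketch below computes the learning rate at a given step for a linear schedule with warmup, using the values listed above (peak 6.25e-06, 500 warmup steps, 8000 total steps). It assumes the usual behaviour of a linear schedule, ramping up linearly from zero during warmup and then decaying linearly to zero at the final step; the function name `linear_schedule_lr` is hypothetical and this is not the training code itself.

```python
PEAK_LR = 6.25e-06    # learning_rate from the list above
WARMUP_STEPS = 500    # lr_scheduler_warmup_steps
TOTAL_STEPS = 8000    # training_steps


def linear_schedule_lr(step: int) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay."""
    if step < WARMUP_STEPS:
        # Warmup: ramp linearly from 0 up to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Decay: fall linearly from the peak to 0 at the final training step.
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, 250, 500, 4250, 8000):
        print(step, linear_schedule_lr(step))
```

Under these assumptions the rate reaches its peak at step 500 and is back to half the peak at step 4250, midway through the decay phase.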

Training results

| Training Loss | Epoch  | Step | Validation Loss | Wer     |
|---------------|--------|------|-----------------|---------|
| 0.3171        | 0.0625 | 500  | 0.3369          | 25.5304 |
| 0.1852        | 0.125  | 1000 | 0.2409          | 17.3110 |
| 0.2353        | 0.1875 | 1500 | 0.2050          | 14.2228 |
| 0.1569        | 1.037  | 2000 | 0.1815          | 12.2861 |
| 0.125         | 1.0995 | 2500 | 0.1692          | 11.1144 |
| 0.12          | 1.162  | 3000 | 0.1600          | 10.6975 |
| 0.069         | 2.0115 | 3500 | 0.1540          | 9.7649  |
| 0.0606        | 2.074  | 4000 | 0.1550          | 9.8199  |
| 0.0434        | 2.1365 | 4500 | 0.1580          | 9.4571  |
| 0.0455        | 2.199  | 5000 | 0.1533          | 9.1410  |
| 0.0216        | 3.0485 | 5500 | 0.1620          | 9.0842  |
| 0.017         | 3.111  | 6000 | 0.1704          | 9.0980  |
| 0.0174        | 3.1735 | 6500 | 0.1681          | 9.0723  |
| 0.0098        | 4.023  | 7000 | 0.1725          | 8.8625  |
| 0.0076        | 4.0855 | 7500 | 0.1765          | 8.8351  |
| 0.007         | 4.148  | 8000 | 0.1787          | 8.8021  |
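One pattern in these results is worth noting: validation loss reaches its minimum at step 5000 and rises afterwards, while WER keeps improving through step 8000. The short sketch below reads the logged numbers back and locates both minima, along with the relative WER reduction between the first and last evaluation. The values are copied from the results above; the variable names are illustrative only.

```python
# (step, validation_loss, wer) for each evaluation, copied from the results above.
RESULTS = [
    (500, 0.3369, 25.5304), (1000, 0.2409, 17.3110), (1500, 0.2050, 14.2228),
    (2000, 0.1815, 12.2861), (2500, 0.1692, 11.1144), (3000, 0.1600, 10.6975),
    (3500, 0.1540, 9.7649), (4000, 0.1550, 9.8199), (4500, 0.1580, 9.4571),
    (5000, 0.1533, 9.1410), (5500, 0.1620, 9.0842), (6000, 0.1704, 9.0980),
    (6500, 0.1681, 9.0723), (7000, 0.1725, 8.8625), (7500, 0.1765, 8.8351),
    (8000, 0.1787, 8.8021),
]

# Evaluation with the lowest validation loss and the one with the lowest WER.
best_loss_step, best_loss, _ = min(RESULTS, key=lambda row: row[1])
best_wer_step, _, best_wer = min(RESULTS, key=lambda row: row[2])

# Relative WER reduction from the first to the last logged evaluation.
first_wer, last_wer = RESULTS[0][2], RESULTS[-1][2]
relative_wer_reduction = 100.0 * (first_wer - last_wer) / first_wer

if __name__ == "__main__":
    print(f"lowest validation loss: {best_loss} at step {best_loss_step}")
    print(f"lowest WER: {best_wer} at step {best_wer_step}")
    print(f"relative WER reduction: {relative_wer_reduction:.1f}%")
```

Because the two metrics disagree on the best checkpoint, anyone selecting a checkpoint from a run like this has to decide which metric to prioritize; the reported final numbers correspond to step 8000, the lowest-WER evaluation.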

Framework versions

  • Transformers 4.46.0.dev0
  • PyTorch 2.4.1+cu121
  • Datasets 3.0.2.dev0
  • Tokenizers 0.20.0

Model size: 764M params (Safetensors, tensor type F32)