
whisper-large-v2-quillr-yt

This model is a fine-tuned version of openai/whisper-large-v2, trained on a local dataset loaded with the audiofolder loader. It achieves the following results on the evaluation set:

  • Loss: 0.9478
  • Wer: 21.1612
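
Wer is the word error rate in percent (lower is better). For reference, a score in this format can be reproduced with the Hugging Face evaluate library; the transcripts below are illustrative placeholders, not data from this model:

```python
# Minimal sketch: computing WER with the Hugging Face `evaluate` library.
# The prediction/reference strings are hypothetical examples.
import evaluate

wer_metric = evaluate.load("wer")

predictions = ["the quick brown fox"]        # model transcripts (placeholder)
references = ["the quick brown fox jumps"]   # ground-truth transcripts (placeholder)

# evaluate returns a fraction; the table below reports it scaled to percent.
wer = 100 * wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")
```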

Model description

More information needed

Intended uses & limitations

More information needed
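
Absent official usage guidance, the checkpoint can be loaded like any Whisper fine-tune through the transformers ASR pipeline. A minimal sketch follows; the repository id is assumed from the card title and may need its owning namespace prefixed, and the audio path is a placeholder:

```python
# Minimal sketch: transcribing a local audio file with the transformers
# automatic-speech-recognition pipeline.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="whisper-large-v2-quillr-yt",  # hypothetical repo id; prefix the namespace if needed
    chunk_length_s=30,                   # Whisper operates on 30-second windows
)

result = asr("sample.wav")  # placeholder path to a local audio file
print(result["text"])
```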

Training and evaluation data

More information needed
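
The card names the audiofolder loader, which reads audio files (plus an optional metadata.csv of transcriptions) from a local directory. A sketch of how such a dataset is typically loaded, with a placeholder path:

```python
# Sketch: loading a local audio dataset with the `audiofolder` builder
# from the datasets library. The data_dir is a hypothetical placeholder.
from datasets import load_dataset

dataset = load_dataset("audiofolder", data_dir="./data")
print(dataset)
```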

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 10
  • num_epochs: 8
  • mixed_precision_training: Native AMP
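
These settings correspond to a standard Seq2SeqTrainingArguments configuration in Transformers 4.37. A sketch of how they might have been specified (the output_dir is a hypothetical placeholder; the Adam betas and epsilon shown are also the library defaults):

```python
# Sketch: the hyperparameters above expressed as Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-large-v2-quillr-yt",  # hypothetical path
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=10,
    num_train_epochs=8,
    fp16=True,  # "Native AMP" mixed-precision training
)
```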

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer     |
|:-------------:|:-----:|:----:|:---------------:|:-------:|
| 1.4089        | 0.33  | 5    | 1.3416          | 28.8868 |
| 1.1392        | 0.67  | 10   | 0.9370          | 24.7121 |
| 0.7847        | 1.0   | 15   | 0.8187          | 22.8887 |
| 0.494         | 1.33  | 20   | 0.7803          | 21.7850 |
| 0.4915        | 1.67  | 25   | 0.7587          | 21.8330 |
| 0.4955        | 2.0   | 30   | 0.7590          | 21.7370 |
| 0.2731        | 2.33  | 35   | 0.7447          | 21.0653 |
| 0.2385        | 2.67  | 40   | 0.7610          | 20.8733 |
| 0.2255        | 3.0   | 45   | 0.7862          | 20.7774 |
| 0.13          | 3.33  | 50   | 0.8078          | 21.2092 |
| 0.1121        | 3.67  | 55   | 0.8140          | 20.9693 |
| 0.1307        | 4.0   | 60   | 0.8156          | 20.7774 |
| 0.0659        | 4.33  | 65   | 0.8636          | 21.1612 |
| 0.0563        | 4.67  | 70   | 0.8704          | 20.9693 |
| 0.0626        | 5.0   | 75   | 0.8657          | 20.4894 |
| 0.0394        | 5.33  | 80   | 0.8948          | 20.8253 |
| 0.0323        | 5.67  | 85   | 0.8978          | 21.0173 |
| 0.0392        | 6.0   | 90   | 0.8924          | 20.7774 |
| 0.0221        | 6.33  | 95   | 0.9137          | 21.3052 |
| 0.019         | 6.67  | 100  | 0.9430          | 21.1612 |
| 0.0182        | 7.0   | 105  | 0.9509          | 21.0653 |
| 0.0129        | 7.33  | 110  | 0.9489          | 20.8733 |
| 0.0127        | 7.67  | 115  | 0.9476          | 21.2092 |
| 0.0137        | 8.0   | 120  | 0.9478          | 21.1612 |

Framework versions

  • Transformers 4.37.2
  • Pytorch 2.3.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.15.2