
Whisper Large SSD superU

This model is a fine-tuned version of openai/whisper-large on an unspecified dataset (the card records the dataset as "None"). It achieves the following results on the evaluation set:

  • Loss: 4.4312
  • Wer: 120.4531
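
A WER above 100 can look like a reporting error, so a short sketch of the standard definition may help. WER is (substitutions + deletions + insertions) divided by the number of reference words, and insertions are unbounded, so a hypothesis much longer than its reference pushes the value past 100%. The card does not say how the metric was computed; the function below assumes the usual word-level edit distance and assumes the reported values are percentages. The function name and the example sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: 100 * (S + D + I) / N, using word-level edit distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Made-up example: 2 reference words, 3 inserted words in the hypothesis.
print(word_error_rate("hello world", "hello there big wide world"))  # 150.0
```

Under this definition, an evaluation WER of 120.4531 means the model makes more word-level errors than there are reference words, which typically points to long or repeated output rather than a miscalculation.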

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 2000
  • mixed_precision_training: Native AMP
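
To make the schedule concrete, here is a minimal sketch of the learning rate these settings imply at a given step. It assumes that `lr_scheduler_type: linear` means linear warmup from 0 to the peak rate followed by linear decay to 0 at the final step; the helper name `linear_warmup_lr` is hypothetical and is not part of any library.

```python
def linear_warmup_lr(
    step: int,
    base_lr: float = 1e-5,     # learning_rate from the card
    warmup_steps: int = 500,   # lr_scheduler_warmup_steps from the card
    total_steps: int = 2000,   # training_steps from the card
) -> float:
    """Learning rate at `step`, assuming linear warmup then linear decay to zero."""
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


for step in (0, 250, 500, 1250, 2000):
    print(step, linear_warmup_lr(step))
```

Under these assumptions the rate peaks at step 500, a quarter of the way through the 2000-step run, and is back to half the peak by step 1250.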

Training results

| Training Loss | Epoch  | Step | Validation Loss | Wer      |
|---------------|--------|------|-----------------|----------|
| 3.1192        | 3.125  | 100  | 2.9964          | 142.5654 |
| 1.7195        | 6.25   | 200  | 2.7478          | 122.2719 |
| 0.6545        | 9.375  | 300  | 3.0978          | 127.3772 |
| 0.1503        | 12.5   | 400  | 3.6068          | 139.0874 |
| 0.0827        | 15.625 | 500  | 3.6768          | 115.5712 |
| 0.0556        | 18.75  | 600  | 3.7650          | 114.9968 |
| 0.0441        | 21.875 | 700  | 3.7594          | 125.3350 |
| 0.0346        | 25.0   | 800  | 3.8227          | 147.7026 |
| 0.0205        | 28.125 | 900  | 3.9344          | 120.8998 |
| 0.0166        | 31.25  | 1000 | 3.9918          | 109.6682 |
| 0.0127        | 34.375 | 1100 | 3.9241          | 109.3491 |
| 0.009         | 37.5   | 1200 | 4.1503          | 110.7211 |
| 0.0029        | 40.625 | 1300 | 4.1240          | 134.5246 |
| 0.0007        | 43.75  | 1400 | 4.3018          | 105.5520 |
| 0.0007        | 46.875 | 1500 | 4.3464          | 106.8283 |
| 0.0004        | 50.0   | 1600 | 4.3809          | 115.5712 |
| 0.0003        | 53.125 | 1700 | 4.4061          | 120.8998 |
| 0.0002        | 56.25  | 1800 | 4.4205          | 120.2936 |
| 0.0002        | 59.375 | 1900 | 4.4289          | 120.4212 |
| 0.0002        | 62.5   | 2000 | 4.4312          | 120.4531 |
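
The numbers above can be summarized with a few lines of Python. The sketch below copies the step, epoch, validation loss, and WER values from the training results, then finds the checkpoints with the lowest WER and lowest validation loss and estimates the training set size from the epoch/step ratio. The size estimate assumes a single device and no gradient accumulation (neither is listed in the hyperparameters), so treat it as a rough upper bound rather than a reported figure.

```python
# (step, epoch, validation_loss, wer), copied from the training results.
rows = [
    (100, 3.125, 2.9964, 142.5654),
    (200, 6.25, 2.7478, 122.2719),
    (300, 9.375, 3.0978, 127.3772),
    (400, 12.5, 3.6068, 139.0874),
    (500, 15.625, 3.6768, 115.5712),
    (600, 18.75, 3.7650, 114.9968),
    (700, 21.875, 3.7594, 125.3350),
    (800, 25.0, 3.8227, 147.7026),
    (900, 28.125, 3.9344, 120.8998),
    (1000, 31.25, 3.9918, 109.6682),
    (1100, 34.375, 3.9241, 109.3491),
    (1200, 37.5, 4.1503, 110.7211),
    (1300, 40.625, 4.1240, 134.5246),
    (1400, 43.75, 4.3018, 105.5520),
    (1500, 46.875, 4.3464, 106.8283),
    (1600, 50.0, 4.3809, 115.5712),
    (1700, 53.125, 4.4061, 120.8998),
    (1800, 56.25, 4.4205, 120.2936),
    (1900, 59.375, 4.4289, 120.4212),
    (2000, 62.5, 4.4312, 120.4531),
]

best_wer = min(rows, key=lambda r: r[3])    # row with the lowest WER
best_loss = min(rows, key=lambda r: r[2])   # row with the lowest validation loss

# 100 optimizer steps cover 3.125 epochs, so one epoch is 32 steps.
steps_per_epoch = rows[0][0] / rows[0][1]

# Assumption: train_batch_size 16 on one device, no gradient accumulation.
approx_train_examples = steps_per_epoch * 16

print("lowest WER:", best_wer[3], "at step", best_wer[0])
print("lowest validation loss:", best_loss[2], "at step", best_loss[0])
print("steps per epoch:", steps_per_epoch)
print("approx. training examples:", approx_train_examples)
```

The lowest WER (105.5520) is at step 1400 and the lowest validation loss (2.7478) at step 200, while the final checkpoint reports 120.4531 and 4.4312. Training loss falling to 0.0002 while validation loss keeps rising is consistent with overfitting on a small dataset, roughly 512 examples under the stated assumptions.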

Framework versions

  • Transformers 4.46.2
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3

Model size

  • Parameters: 1.54B
  • Tensor type: F32
  • Format: Safetensors

Model: shreyasdesaisuperU/whisper-large-large-attempt1 (fine-tuned from openai/whisper-large)