
dgx1_whisper_base_libri360_noisy_teacher_distil_epochs_50_batch_8

This model is a fine-tuned version of rohitp1/subhadeep_whisper_base_finetune_teacher_babble_noise_libri_360_hours_100_epochs_batch_8 on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5761
  • WER: 10.6733
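
The checkpoint can be loaded for inference with the transformers ASR pipeline. A minimal sketch follows; the repo id is an assumption based on the teacher checkpoint's namespace, and the audio path is a placeholder.

```python
# Minimal inference sketch with the transformers ASR pipeline.
# The repo id assumes the model lives under the same namespace as its
# teacher checkpoint (rohitp1/...) -- adjust if it is hosted elsewhere.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="rohitp1/dgx1_whisper_base_libri360_noisy_teacher_distil_epochs_50_batch_8",  # assumed repo id
)

result = asr("sample.wav")  # placeholder path to a 16 kHz mono audio file
print(result["text"])
```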

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 256
  • total_train_batch_size: 2048
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_warmup_ratio: 0.2
  • num_epochs: 50
  • mixed_precision_training: Native AMP
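
A sketch of how these hyperparameters would map onto Seq2SeqTrainingArguments is shown below. The card does not ship the training script, so model and data wiring are omitted and the output_dir is hypothetical.

```python
# Sketch of Seq2SeqTrainingArguments mirroring the hyperparameters
# listed above; not the author's actual training script.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./dgx1_whisper_base_libri360_noisy_teacher_distil",  # hypothetical
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=256,   # effective batch size: 8 * 256 = 2048
    lr_scheduler_type="cosine_with_restarts",
    warmup_ratio=0.2,
    num_train_epochs=50,
    fp16=True,                         # Native AMP mixed precision
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 match the Trainer defaults.
)
```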

Training results

| Training Loss | Epoch | Step | Validation Loss | WER     |
|:-------------:|:-----:|:----:|:---------------:|:-------:|
| 0.0423        | 1.48  | 150  | 0.1620          | 10.8902 |
| 0.0999        | 2.96  | 300  | 0.2030          | 10.7882 |
| 0.1577        | 4.45  | 450  | 0.2511          | 10.7937 |
| 0.2078        | 5.94  | 600  | 0.2966          | 10.7827 |
| 0.252         | 7.42  | 750  | 0.3321          | 10.7524 |
| 0.2841        | 8.91  | 900  | 0.3625          | 10.7588 |
| 0.3189        | 10.39 | 1050 | 0.3858          | 10.7772 |
| 0.341         | 11.88 | 1200 | 0.4090          | 10.7505 |
| 0.5277        | 13.36 | 1350 | 0.5461          | 11.1926 |
| 0.8342        | 14.85 | 1500 | 0.5250          | 10.8415 |
| 0.8278        | 16.33 | 1650 | 0.5543          | 10.7478 |
| 0.8255        | 17.82 | 1800 | 0.5481          | 10.6761 |
| 0.822         | 19.31 | 1950 | 0.5504          | 10.6650 |
| 0.8204        | 20.79 | 2100 | 0.5556          | 10.6650 |
| 0.8246        | 22.28 | 2250 | 0.5598          | 10.6586 |
| 0.8228        | 23.76 | 2400 | 0.5634          | 10.6770 |
| 0.8282        | 25.25 | 2550 | 0.5670          | 10.6706 |
| 0.8264        | 26.73 | 2700 | 0.5702          | 10.6752 |
| 0.8298        | 28.22 | 2850 | 0.5731          | 10.6908 |
| 0.8273        | 29.7  | 3000 | 0.5761          | 10.6733 |
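
The WER column, like the headline figure above, reports word error rate as a percentage. A minimal sketch of how such a score is computed, using the jiwer library as an assumed stand-in for the card's evaluation code:

```python
# Word error rate as reported in the WER column (a percentage);
# jiwer is an assumption, not part of this card's tooling.
import jiwer

wer = jiwer.wer(
    ["the cat sat on the mat"],   # reference transcripts
    ["the cat sat on a mat"],     # model hypotheses
)
print(f"WER: {wer * 100:.4f}")    # 1 substitution / 6 words -> 16.6667
```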

Framework versions

  • Transformers 4.25.1
  • PyTorch 1.12.1
  • Datasets 2.8.0
  • Tokenizers 0.13.2