
finetune_teacher_babble_noise_mozilla_200_epochs

This model is a fine-tuned version of an unspecified base model on the None dataset. It achieves the following results on the evaluation set:

  • Loss: 71.8264
  • Wer: 0.3574
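
The Wer figure above is the word error rate: the word-level edit distance between the reference transcript and the model's hypothesis, divided by the number of reference words. A minimal sketch in pure Python (not the actual evaluation code, which likely uses a library such as `evaluate` or `jiwer`):

```python
# Word error rate (WER) via word-level Levenshtein distance.
# Illustrative implementation only; the training pipeline's metric
# code is not included in this card.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)
```

A WER of 0.3574 means roughly one word-level error (substitution, insertion, or deletion) for every 2.8 reference words.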

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 4
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 256
  • total_train_batch_size: 1024
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.2
  • num_epochs: 200
  • mixed_precision_training: Native AMP
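
The total_train_batch_size above follows from the per-device batch size and gradient accumulation. A small sketch of that arithmetic (the dataclass and field names are illustrative, not the training script's actual API):

```python
from dataclasses import dataclass

# Illustrative container for the hyperparameters listed above;
# field names mirror the card but are assumptions about the setup.
@dataclass
class TrainConfig:
    learning_rate: float = 5e-4
    train_batch_size: int = 4          # per-device batch size
    gradient_accumulation_steps: int = 256
    num_epochs: int = 200
    lr_scheduler_warmup_ratio: float = 0.2

    def total_train_batch_size(self) -> int:
        # 4 samples per step * 256 accumulated steps = 1024
        return self.train_batch_size * self.gradient_accumulation_steps
```

Gradient accumulation lets a large effective batch (1024) fit on hardware that can only hold 4 samples at once: gradients are summed over 256 small forward/backward passes before each optimizer step.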

Training results

Training Loss   Epoch    Step    Validation Loss   Wer
148.8494        14.7     1000    41.8514           0.3998
101.5704        29.41    2000    41.9244           0.3942
 87.7921        44.12    3000    44.8273           0.4013
 74.0441        58.82    4000    48.9263           0.3976
 61.9751        73.53    5000    48.6313           0.3950
 51.4311        88.23    6000    52.6974           0.3915
 42.7197        102.94   7000    51.2589           0.3862
 35.5205        117.64   8000    57.6496           0.3841
 29.2148        132.35   9000    64.6558           0.3745
 24.4399        147.06   10000   62.6512           0.3692
 20.5101        161.76   11000   67.4978           0.3625
 18.0444        176.47   12000   72.0740           0.3584
 16.6810        191.18   13000   71.8264           0.3574
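
The learning-rate trajectory behind these results (cosine schedule, warmup ratio 0.2) can be sketched as follows. This is an assumed linear-warmup-then-cosine-decay shape in plain Python, matching the common form of such schedulers, not the exact implementation used in training:

```python
import math

def lr_at(step: int, total_steps: int,
          base_lr: float = 5e-4, warmup_ratio: float = 0.2) -> float:
    """Linear warmup to base_lr over the first 20% of steps,
    then cosine decay from base_lr down to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under this shape, the learning rate peaks at 0.0005 once warmup ends (around epoch 40 of 200) and decays smoothly to zero by the final step.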

Framework versions

  • Transformers 4.24.0
  • Pytorch 1.12.1
  • Datasets 2.7.1
  • Tokenizers 0.11.0