
Nystrom-W2V2-100hrs-take-1

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 341.3232
  • WER: 1.0383
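Note that a WER above 1.0 is possible: word error rate is the word-level edit distance divided by the number of reference words, so enough insertions can push it past 100%. A minimal sketch of the metric (a pure-Python stand-in for library implementations such as `jiwer`; the function name is illustrative):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the edit distance between ref[:i-1] and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1] / len(ref)

# Two insertions against a two-word reference already give WER = 1.0:
print(wer("the cat", "the black cat sat"))  # → 1.0
```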

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.002
  • train_batch_size: 4
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.2
  • num_epochs: 30
  • mixed_precision_training: Native AMP
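The batch-size entries above are related by a simple product (4 × 16 = 64), and the linear scheduler with `warmup_ratio: 0.2` ramps the learning rate up over the first 20% of training steps, then decays it linearly to zero. A small sketch of that arithmetic (function and variable names are illustrative, not from the training code):

```python
# Effective train batch size = per-device batch size × gradient accumulation steps
train_batch_size = 4
gradient_accumulation_steps = 16
total_train_batch_size = train_batch_size * gradient_accumulation_steps  # 64

def linear_warmup_lr(step, total_steps, peak_lr=0.002, warmup_ratio=0.2):
    """Linear warmup to peak_lr, then linear decay to 0
    (the shape implied by lr_scheduler_type=linear with warmup_ratio=0.2)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * (total_steps - step) / (total_steps - warmup_steps)

print(total_train_batch_size)          # → 64
print(linear_warmup_lr(200, 1000))     # peak LR reached at end of warmup → 0.002
```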

Training results

| Training Loss | Epoch | Step | Validation Loss | WER    |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 2303.9713     | 1.12  | 500  | 310.4888        | 1.0    |
| 2124.0935     | 2.25  | 1000 | 308.2970        | 1.0    |
| 2124.3255     | 3.37  | 1500 | 308.7900        | 1.0    |
| 2117.8083     | 4.49  | 2000 | 304.9168        | 1.0004 |
| 2109.4828     | 5.62  | 2500 | 302.9160        | 0.9822 |
| 2110.5865     | 6.74  | 3000 | 301.8441        | 0.9815 |
| 2112.032      | 7.86  | 3500 | 299.5007        | 1.0203 |
| 2123.3007     | 8.99  | 4000 | 300.7238        | 1.0462 |
| 2104.799      | 10.11 | 4500 | 322.6273        | 1.0285 |
| 2115.346      | 11.24 | 5000 | 324.5243        | 1.0000 |
| 2109.4525     | 12.36 | 5500 | 325.2796        | 0.9906 |
| 2106.9338     | 13.48 | 6000 | 296.8322        | 1.0662 |
| 2096.3611     | 14.61 | 6500 | 300.2722        | 1.1456 |
| 2072.9336     | 15.73 | 7000 | 297.6896        | 1.1467 |
| 2070.3176     | 16.85 | 7500 | 298.6618        | 1.0967 |
| 2037.9215     | 17.98 | 8000 | 341.3232        | 1.0383 |

Framework versions

  • Transformers 4.24.0
  • PyTorch 1.12.1
  • Datasets 2.7.1
  • Tokenizers 0.11.0