license: apache-2.0
tags:
  - automatic-speech-recognition
  - /workspace/data/uk/noizy_student_1/
  - generated_from_trainer
model-index:
  - name: ''
    results: []

This model is a fine-tuned version of facebook/wav2vec2-xls-r-1b on a local dataset stored at /workspace/data/uk/noizy_student_1/ (no dataset name or configuration is recorded). It achieves the following results on the evaluation set:

  • Loss: 0.1285
  • Wer: 0.1821
  • Cer: 0.0342
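
For readers unfamiliar with the two error metrics above, the sketch below shows one common way word error rate (WER) and character error rate (CER) are defined: Levenshtein edit distance divided by the reference length, over words and over characters respectively. This is an illustration only. The function names and the example sentences are hypothetical, and the evaluation script actually used for this model may normalize text or handle spaces differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference item
                            curr[j - 1] + 1,      # insert a hypothesis item
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits (spaces included here,
    which is an assumption) divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: one word of six is dropped by the recognizer.
reference = "the cat sat on the mat"
hypothesis = "the cat sat on mat"
print(f"WER = {wer(reference, hypothesis):.4f}")
print(f"CER = {cer(reference, hypothesis):.4f}")
```

Under this definition, a WER of 0.1821 corresponds to roughly 18 word-level edits per 100 reference words.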

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 64
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • training_steps: 10000
  • mixed_precision_training: Native AMP
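
The hyperparameters above relate to each other by simple arithmetic, sketched below in plain Python. The effective batch size of 128 follows from 16 × 8 only if training ran on a single device, which is an assumption here (`num_devices` is a hypothetical name). The `lr_at` helper is likewise hypothetical: it mirrors the usual shape of a linear warmup followed by half-cosine decay, not necessarily the exact scheduler code used in training.

```python
import math

# Values taken from the hyperparameter list above.
learning_rate = 5e-05
train_batch_size = 16
gradient_accumulation_steps = 8
training_steps = 10_000
warmup_ratio = 0.1

# Assumption: a single device, which is what makes 16 * 8 equal 128.
num_devices = 1

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
warmup_steps = round(training_steps * warmup_ratio)


def lr_at(step):
    """Learning rate at an optimizer step: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    progress = (step - warmup_steps) / (training_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print("effective batch size:", total_train_batch_size)
print("warmup steps:", warmup_steps)
for step in (0, 500, 1000, 5500, 10_000):
    print(f"step {step:>5}: lr = {lr_at(step):.2e}")
```

With these settings the learning rate peaks at 5e-05 once warmup ends and decays to zero at the final step.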

Training results

| Training Loss | Epoch | Step  | Validation Loss | Wer    | Cer    |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|
| 1.2323        | 3.22  | 500   | 0.2816          | 0.4133 | 0.0797 |
| 0.9826        | 6.45  | 1000  | 0.1970          | 0.2688 | 0.0514 |
| 0.8628        | 9.67  | 1500  | 0.1649          | 0.2485 | 0.0474 |
| 0.8348        | 12.9  | 2000  | 0.1605          | 0.2460 | 0.0467 |
| 0.8186        | 16.13 | 2500  | 0.1608          | 0.2469 | 0.0469 |
| 0.8011        | 19.35 | 3000  | 0.1620          | 0.2412 | 0.0468 |
| 0.807         | 22.58 | 3500  | 0.1737          | 0.2524 | 0.0498 |
| 0.7758        | 25.8  | 4000  | 0.1709          | 0.2536 | 0.0498 |
| 0.7923        | 29.03 | 4500  | 0.1645          | 0.2436 | 0.0474 |
| 0.7717        | 32.26 | 5000  | 0.1811          | 0.2636 | 0.0524 |
| 0.7447        | 35.48 | 5500  | 0.1635          | 0.2405 | 0.0468 |
| 0.7267        | 38.71 | 6000  | 0.1578          | 0.2354 | 0.0462 |
| 0.7046        | 41.93 | 6500  | 0.1555          | 0.2296 | 0.0444 |
| 0.6896        | 45.16 | 7000  | 0.1548          | 0.2272 | 0.0439 |
| 0.6575        | 48.38 | 7500  | 0.1432          | 0.2096 | 0.0399 |
| 0.6264        | 51.61 | 8000  | 0.1466          | 0.2056 | 0.0398 |
| 0.589         | 54.83 | 8500  | 0.1351          | 0.1943 | 0.0371 |
| 0.573         | 58.06 | 9000  | 0.1387          | 0.1934 | 0.0365 |
| 0.5537        | 61.29 | 9500  | 0.1328          | 0.1883 | 0.0353 |
| 0.544         | 64.51 | 10000 | 0.1285          | 0.1821 | 0.0342 |
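
The Epoch and Step values logged above allow a rough estimate of how many optimizer steps make up one epoch, and from that an approximate training-set size. The sketch below is back-of-the-envelope arithmetic only: the epoch values are rounded to two decimals, the effective batch size of 128 is taken from the hyperparameters, and a possibly incomplete final batch per epoch is ignored, so the implied example count is an estimate and not a reported figure.

```python
# (epoch, step) pairs copied from the first and last logged evaluations.
logged = [(3.22, 500), (64.51, 10_000)]

# Effective batch size from the hyperparameter list.
total_train_batch_size = 128

estimates = [step / epoch for epoch, step in logged]
for (epoch, step), steps_per_epoch in zip(logged, estimates):
    print(f"step {step:>5} at epoch {epoch:>5}: ~{steps_per_epoch:.1f} steps per epoch")

# Use the last pair, where rounding of the epoch value matters least.
steps_per_epoch = estimates[-1]
approx_examples = steps_per_epoch * total_train_batch_size
print(f"implied training examples (rough estimate): ~{approx_examples:.0f}")
```

Both pairs point to roughly 155 optimizer steps per epoch.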

Framework versions

  • Transformers 4.17.0.dev0
  • PyTorch 1.10.2
  • Datasets 1.18.4.dev0
  • Tokenizers 0.11.0