
model_weight_1

This model is a fine-tuned version of nguyenvulebinh/wav2vec2-base-vietnamese-250h on the Common Voice 11.0 (common_voice_11_0) dataset. It achieves the following results on the evaluation set (a usage sketch follows the results):

  • Loss: 0.1739
  • Wer: 0.1265
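
Since the base model is a Vietnamese wav2vec2 checkpoint, the fine-tuned model is presumably intended for Vietnamese speech-to-text. Below is a minimal inference sketch, assuming the checkpoint follows the standard Wav2Vec2ForCTC + Wav2Vec2Processor layout produced by the Transformers Trainer; the repository id ("your-username/model_weight_1") and the audio file path are placeholders, not names from this card.

```python
# Minimal inference sketch (assumptions: standard Wav2Vec2 CTC checkpoint layout,
# placeholder repo id and audio path, 16 kHz mono input).
import torch
import librosa
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

model_id = "your-username/model_weight_1"  # placeholder repository id
processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)
model.eval()

# wav2vec2 expects 16 kHz audio; librosa resamples on load.
speech, _ = librosa.load("example.wav", sr=16_000)

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding: pick the most likely token at each frame, then collapse.
pred_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(pred_ids)[0])
```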

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after the list):

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 40
  • mixed_precision_training: Native AMP
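
These hyperparameters map directly onto Transformers TrainingArguments. The sketch below is an assumption about how they were passed, not the original training script; in particular, whether the batch sizes are per device or total is not stated in this card, and the output directory name is a placeholder.

```python
# Hedged TrainingArguments sketch matching the hyperparameters listed above.
# Assumptions: batch sizes are per device, "Native AMP" corresponds to fp16=True,
# and the output directory name is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="model_weight_1",    # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=1000,
    num_train_epochs=40,
    fp16=True,                      # Native AMP mixed-precision training
)
```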

Training results

| Training Loss | Epoch   | Step  | Validation Loss | Wer    |
|---------------|---------|-------|-----------------|--------|
| 14.4773       | 1.3928  | 500   | 5.1397          | 1.0002 |
| 4.442         | 2.7855  | 1000  | 5.1727          | 1.0    |
| 3.9171        | 4.1783  | 1500  | 3.4650          | 0.9913 |
| 3.2597        | 5.5710  | 2000  | 2.1658          | 0.8943 |
| 2.5676        | 6.9638  | 2500  | 1.4240          | 0.7346 |
| 2.0229        | 8.3565  | 3000  | 0.9604          | 0.5685 |
| 1.6744        | 9.7493  | 3500  | 0.9651          | 0.4661 |
| 1.4788        | 11.1421 | 4000  | 0.7943          | 0.4500 |
| 1.3045        | 12.5348 | 4500  | 0.6500          | 0.3282 |
| 1.3199        | 13.9276 | 5000  | 0.4307          | 0.3130 |
| 1.1017        | 15.3203 | 5500  | 0.7321          | 0.2742 |
| 1.0042        | 16.7131 | 6000  | 0.9041          | 0.2408 |
| 1.0219        | 18.1058 | 6500  | 0.6662          | 0.2374 |
| 0.9303        | 19.4986 | 7000  | 0.7430          | 0.2171 |
| 0.8425        | 20.8914 | 7500  | 1.5198          | 0.1954 |
| 0.8409        | 22.2841 | 8000  | 0.6491          | 0.1982 |
| 0.881         | 23.6769 | 8500  | 0.6060          | 0.1734 |
| 0.8061        | 25.0696 | 9000  | 0.4495          | 0.1607 |
| 0.7404        | 26.4624 | 9500  | 0.6027          | 0.1630 |
| 0.713         | 27.8552 | 10000 | 0.5014          | 0.1542 |
| 0.7678        | 29.2479 | 10500 | 0.2076          | 0.1491 |
| 0.7059        | 30.6407 | 11000 | 0.2030          | 0.1497 |
| 0.6873        | 32.0334 | 11500 | 0.5304          | 0.1390 |
| 0.6471        | 33.4262 | 12000 | 0.4658          | 0.1378 |
| 0.6007        | 34.8189 | 12500 | 0.1836          | 0.1365 |
| 0.6758        | 36.2117 | 13000 | 0.1798          | 0.1314 |
| 0.6231        | 37.6045 | 13500 | 0.1793          | 0.1312 |
| 0.6034        | 38.9972 | 14000 | 0.1739          | 0.1265 |
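
The Wer column above is the word error rate on the validation set. The exact evaluation script is not shown in this card; the sketch below assumes the standard wer metric from the evaluate library, with illustrative prediction/reference strings.

```python
# Sketch of a WER computation (assumption: the standard `wer` metric from the
# `evaluate` library; the prediction and reference strings are illustrative only).
import evaluate

wer_metric = evaluate.load("wer")

predictions = ["xin chao cac ban"]   # model transcriptions (placeholder)
references = ["xin chào các bạn"]    # ground-truth transcripts (placeholder)

wer = wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")
```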

Framework versions

  • Transformers 4.40.2
  • PyTorch 2.2.1+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1