model_weight

This model is a fine-tuned version of nguyenvulebinh/wav2vec2-base-vietnamese-250h on the common_voice_11_0 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1765
  • Wer: 0.1401
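
"Wer" here is the word error rate: the number of word substitutions, deletions and insertions needed to turn the model's transcript into the reference, divided by the number of reference words. A value of 0.1401 therefore corresponds to roughly 14 word errors per 100 reference words. The sketch below is a minimal pure-Python illustration of that definition. It is not the evaluation code used for this model; `word_error_rate` is a hypothetical helper, and any text normalization applied before scoring is not documented in this card.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of four reference words.
print(word_error_rate("xin chào các bạn", "xin chào bạn"))
```

Because insertions count as errors, the ratio can exceed 1.0 when the hypothesis is much longer than the reference.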

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 32
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 40
  • mixed_precision_training: Native AMP
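
To make the `linear` scheduler with 1000 warmup steps concrete, the sketch below computes the learning rate at a given optimizer step: a linear ramp from 0 to the peak rate during warmup, then a linear decay to 0 at the end of training. `linear_schedule_lr` is a hypothetical helper written for illustration. The `total_steps` default of 14360 is an assumption, inferred from the roughly 359 steps per epoch visible in the training log (step 500 at epoch 1.3928) multiplied by 40 epochs; it is not stated in this card.

```python
def linear_schedule_lr(
    step: int,
    base_lr: float = 1e-4,     # learning_rate from the card
    warmup_steps: int = 1000,  # lr_scheduler_warmup_steps from the card
    total_steps: int = 14360,  # ASSUMPTION: ~359 steps/epoch * 40 epochs
) -> float:
    """Learning rate under linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 500, 1000, 7680, 14360):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the peak rate of 0.0001 is reached at step 1000, which falls between the second and third evaluation points in the training log.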

Training results

| Training Loss | Epoch | Step | Validation Loss | Wer |
|---|---|---|---|---|
| 15.0719 | 1.3928 | 500 | 4.8260 | 1.0 |
| 4.4273 | 2.7855 | 1000 | 4.6865 | 0.9991 |
| 3.9296 | 4.1783 | 1500 | 4.2965 | 0.9992 |
| 3.4964 | 5.5710 | 2000 | 2.6642 | 0.9583 |
| 2.8184 | 6.9638 | 2500 | 1.7146 | 0.8718 |
| 2.132 | 8.3565 | 3000 | 1.4549 | 0.7103 |
| 1.7481 | 9.7493 | 3500 | 0.9072 | 0.5730 |
| 1.5776 | 11.1421 | 4000 | 0.7414 | 0.5132 |
| 1.3743 | 12.5348 | 4500 | 0.6621 | 0.4089 |
| 1.2417 | 13.9276 | 5000 | 0.4884 | 0.3854 |
| 1.1375 | 15.3203 | 5500 | 0.3561 | 0.3123 |
| 1.0412 | 16.7131 | 6000 | 0.3344 | 0.2945 |
| 0.981 | 18.1058 | 6500 | 0.3063 | 0.2667 |
| 0.9913 | 19.4986 | 7000 | 0.2778 | 0.2244 |
| 0.861 | 20.8914 | 7500 | 0.2511 | 0.2170 |
| 0.8314 | 22.2841 | 8000 | 0.2498 | 0.2127 |
| 0.8669 | 23.6769 | 8500 | 0.2452 | 0.2048 |
| 0.8003 | 25.0696 | 9000 | 0.2251 | 0.1830 |
| 0.7409 | 26.4624 | 9500 | 0.2292 | 0.1820 |
| 0.7282 | 27.8552 | 10000 | 0.2130 | 0.1681 |
| 0.7675 | 29.2479 | 10500 | 0.2290 | 0.1796 |
| 0.7295 | 30.6407 | 11000 | 0.1971 | 0.1617 |
| 0.6308 | 32.0334 | 11500 | 0.2032 | 0.1555 |
| 0.6251 | 33.4262 | 12000 | 0.1905 | 0.1515 |
| 0.5887 | 34.8189 | 12500 | 0.1844 | 0.1481 |
| 0.6642 | 36.2117 | 13000 | 0.1796 | 0.1444 |
| 0.6068 | 37.6045 | 13500 | 0.1808 | 0.1417 |
| 0.5862 | 38.9972 | 14000 | 0.1765 | 0.1401 |
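
The Epoch and Step columns together imply how many optimizer steps made up one epoch, which the card does not state directly. The sketch below estimates it from three logged rows. The derived training-set size is only a rough bound and rests on assumptions not confirmed here: a single device, no gradient accumulation, and a possibly partial final batch. `steps_per_epoch` is a hypothetical helper.

```python
# (epoch, step) pairs copied from the first, a middle, and the last logged rows.
LOGGED = [(1.3928, 500), (20.8914, 7500), (38.9972, 14000)]


def steps_per_epoch(rows):
    """Average the step/epoch ratio over the rows and round to a whole step."""
    ratios = [step / epoch for epoch, step in rows]
    return round(sum(ratios) / len(ratios))


per_epoch = steps_per_epoch(LOGGED)
total_steps = per_epoch * 40        # num_epochs from the card
max_examples = per_epoch * 32       # train_batch_size from the card

print("steps per epoch:", per_epoch)
print("total steps over 40 epochs:", total_steps)
# ASSUMPTION: one device, no gradient accumulation, last batch may be partial.
print("training examples per epoch: at most", max_examples)
```

The last logged step (14000, epoch 38.9972) sits just under 39 full epochs, so the final stretch of training up to epoch 40 has no evaluation row in this log.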

Framework versions

  • Transformers 4.40.0
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.0
  • Tokenizers 0.19.1
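
When reproducing the run, it can help to compare the local environment against the versions listed above. The following is a minimal sketch using only the standard library. `version_mismatches` is a hypothetical helper, and the distribution names in `PINNED` (in particular `torch` for PyTorch) are assumed to be the usual PyPI names.

```python
from importlib import metadata

# Versions from this card; the "+cu121" build tag is dropped for comparison.
PINNED = {
    "transformers": "4.40.0",
    "torch": "2.2.1",
    "datasets": "2.19.0",
    "tokenizers": "0.19.1",
}


def version_mismatches(pinned=PINNED):
    """Return {package: description} for packages that differ from the pins."""
    problems = {}
    for package, expected in pinned.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            problems[package] = "not installed"
            continue
        # Ignore local build tags such as "+cu121" on the installed version.
        if installed.split("+")[0] != expected:
            problems[package] = f"installed {installed}, card lists {expected}"
    return problems


for name, problem in version_mismatches().items():
    print(f"{name}: {problem}")
```

A mismatch does not necessarily mean the model will fail to load; it only flags where the environment differs from the one used for training.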

Model size: 94.5M params (Safetensors, tensor type F32)