---
license: cc-by-nc-4.0
tags:
- generated_from_trainer
datasets:
- common_voice_11_0
metrics:
- wer
base_model: nguyenvulebinh/wav2vec2-base-vietnamese-250h
model-index:
- name: model_weight
  results:
  - task:
      type: automatic-speech-recognition
      name: Automatic Speech Recognition
    dataset:
      name: common_voice_11_0
      type: common_voice_11_0
      config: vi
      split: None
      args: vi
    metrics:
    - type: wer
      value: 0.14013683555810727
      name: Wer
---


# model_weight

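This model is a fine-tuned version of [nguyenvulebinh/wav2vec2-base-vietnamese-250h](https://huggingface.co/nguyenvulebinh/wav2vec2-base-vietnamese-250h) on the Vietnamese (`vi`) configuration of the common_voice_11_0 dataset, for automatic speech recognition.
It achieves the following results on the evaluation set (last logged evaluation, step 14000):
- Loss: 0.1765
- WER: 0.1401 (word error rate, about 14.0%)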
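
For reference, WER is the word-level edit distance (substitutions, deletions, insertions) between the reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal pure-Python illustration of that definition; the function name `word_error_rate` and the example sentences are hypothetical, and it is not the metric implementation used to produce the numbers above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one of four reference words is dropped.
example_wer = word_error_rate("xin chào các bạn", "xin chào bạn")
```

A score of 0.1401 therefore means roughly 14 word-level errors per 100 reference words. Because insertions count as errors, WER can exceed 1.0.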

## Model description

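A wav2vec 2.0 model for Vietnamese automatic speech recognition, fine-tuned from `nguyenvulebinh/wav2vec2-base-vietnamese-250h`. More information needed.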

## Intended uses & limitations

More information needed

## Training and evaluation data

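The `vi` configuration of common_voice_11_0. The evaluation split was not recorded (the card metadata lists `split: None`). More information needed.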

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 32
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1000
- num_epochs: 40
- mixed_precision_training: Native AMP
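
To illustrate how `learning_rate`, `lr_scheduler_type: linear`, and `lr_scheduler_warmup_steps` interact, here is a minimal sketch of a linear schedule with warmup. It follows the usual shape of such a schedule (linear ramp from 0 to the peak, then linear decay to 0) and is not copied from the training script. The total of 14,360 steps is an assumption inferred from the results table (about 359 steps per epoch over 40 epochs); the function name `linear_lr` is hypothetical.

```python
PEAK_LR = 1e-4        # learning_rate from the list above
WARMUP_STEPS = 1000   # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 14_360  # ASSUMPTION: ~359 steps/epoch * 40 epochs, inferred from the table


def linear_lr(step: int,
              peak_lr: float = PEAK_LR,
              warmup_steps: int = WARMUP_STEPS,
              total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at a given optimizer step for linear warmup plus linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from the peak to 0 at the final step.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at a few points in training.
for step in (0, 500, 1000, 7680, 14_360):
    print(f"step {step:>6}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate peaks at step 1000, which is the second logged evaluation in the results table.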

### Training results

| Training Loss | Epoch   | Step  | Validation Loss | WER    |
|:-------------:|:-------:|:-----:|:---------------:|:------:|
| 15.0719       | 1.3928  | 500   | 4.8260          | 1.0000 |
| 4.4273        | 2.7855  | 1000  | 4.6865          | 0.9991 |
| 3.9296        | 4.1783  | 1500  | 4.2965          | 0.9992 |
| 3.4964        | 5.5710  | 2000  | 2.6642          | 0.9583 |
| 2.8184        | 6.9638  | 2500  | 1.7146          | 0.8718 |
| 2.1320        | 8.3565  | 3000  | 1.4549          | 0.7103 |
| 1.7481        | 9.7493  | 3500  | 0.9072          | 0.5730 |
| 1.5776        | 11.1421 | 4000  | 0.7414          | 0.5132 |
| 1.3743        | 12.5348 | 4500  | 0.6621          | 0.4089 |
| 1.2417        | 13.9276 | 5000  | 0.4884          | 0.3854 |
| 1.1375        | 15.3203 | 5500  | 0.3561          | 0.3123 |
| 1.0412        | 16.7131 | 6000  | 0.3344          | 0.2945 |
| 0.9810        | 18.1058 | 6500  | 0.3063          | 0.2667 |
| 0.9913        | 19.4986 | 7000  | 0.2778          | 0.2244 |
| 0.8610        | 20.8914 | 7500  | 0.2511          | 0.2170 |
| 0.8314        | 22.2841 | 8000  | 0.2498          | 0.2127 |
| 0.8669        | 23.6769 | 8500  | 0.2452          | 0.2048 |
| 0.8003        | 25.0696 | 9000  | 0.2251          | 0.1830 |
| 0.7409        | 26.4624 | 9500  | 0.2292          | 0.1820 |
| 0.7282        | 27.8552 | 10000 | 0.2130          | 0.1681 |
| 0.7675        | 29.2479 | 10500 | 0.2290          | 0.1796 |
| 0.7295        | 30.6407 | 11000 | 0.1971          | 0.1617 |
| 0.6308        | 32.0334 | 11500 | 0.2032          | 0.1555 |
| 0.6251        | 33.4262 | 12000 | 0.1905          | 0.1515 |
| 0.5887        | 34.8189 | 12500 | 0.1844          | 0.1481 |
| 0.6642        | 36.2117 | 13000 | 0.1796          | 0.1444 |
| 0.6068        | 37.6045 | 13500 | 0.1808          | 0.1417 |
| 0.5862        | 38.9972 | 14000 | 0.1765          | 0.1401 |
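
The Epoch and Step columns together imply the number of optimizer steps per epoch, which in turn bounds the training set size. The sketch below works through that arithmetic using three rows from the table. The bound on examples per epoch assumes a single device and no gradient accumulation, neither of which is stated in this card; the variable names are illustrative.

```python
# (epoch, step) pairs copied from the first, a middle, and the last table row.
rows = [(1.3928, 500), (20.8914, 7500), (38.9972, 14000)]

# Each row gives step / epoch ~= steps per epoch; average and round to an integer.
steps_per_epoch = round(sum(step / epoch for epoch, step in rows) / len(rows))

NUM_EPOCHS = 40        # num_epochs from the hyperparameter list
TRAIN_BATCH_SIZE = 32  # train_batch_size from the hyperparameter list

total_steps = steps_per_epoch * NUM_EPOCHS

# ASSUMPTION: one device, no gradient accumulation, so one step == one batch of 32.
# The last batch of an epoch may be partial, so this is an upper bound.
max_examples_per_epoch = steps_per_epoch * TRAIN_BATCH_SIZE

print(steps_per_epoch, total_steps, max_examples_per_epoch)
```

This gives 359 steps per epoch and 14,360 steps for the full 40 epochs, so the last logged evaluation at step 14000 falls shortly before the end of training.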


### Framework versions

- Transformers 4.40.0
- PyTorch 2.2.1+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1