---
license: apache-2.0
base_model: facebook/wav2vec2-xls-r-300m
tags:
- generated_from_trainer
datasets:
- common_voice_13_0
metrics:
- wer
model-index:
- name: output
  results:
  - task:
      name: Automatic Speech Recognition
      type: automatic-speech-recognition
    dataset:
      name: common_voice_13_0
      type: common_voice_13_0
      config: hi
      split: test
      args: hi
    metrics:
    - name: Wer
      type: wer
      value: 1.019918009027289
---


# output

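This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the Hindi (`hi`) configuration of the common_voice_13_0 dataset.
It achieves the following results on the evaluation set at the final logged checkpoint (step 12400). WER is reported as a fraction, so a value of 1.0199 corresponds to a word error rate of roughly 102%: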
- Loss: 0.7883
- Wer: 1.0199
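
For readers unfamiliar with the metric: WER is the word-level edit distance between the hypothesis and the reference, divided by the number of reference words, which is why it can exceed 1.0 when the output contains many insertions. The following is a minimal pure-Python sketch of that definition. The function name `word_error_rate` and the sample sentences are illustrative assumptions, and this is not necessarily the implementation that produced the numbers above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences (not taken from the dataset).
print(word_error_rate("a b c d", "a x c d"))          # one substitution in four words
print(word_error_rate("hello world", "a b c d e"))    # more errors than reference words
```

The second call returns a value above 1.0 because the hypothesis has five wrong words against a two-word reference.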

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 2
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 30
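
Two of the values above follow from the others, and a short sketch may make the relationship concrete. The total train batch size is the per-step batch size multiplied by the gradient accumulation steps (assuming a single device, which is consistent with 2 × 8 = 16), and a linear schedule with warmup ramps the learning rate from 0 to its peak over the warmup steps and then decays it linearly to 0. The function names below are hypothetical, and `total_steps=12_660` is only an estimate derived from the roughly 422 steps per epoch visible in the training log, not a value stated in this card.

```python
def effective_batch_size(per_device: int, accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Samples contributing to one optimizer update (num_devices is assumed)."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step: int, base_lr: float = 3e-4, warmup_steps: int = 500,
              total_steps: int = 12_660) -> float:
    """Linear warmup to base_lr, then linear decay to zero.

    total_steps is an estimate (about 422 steps/epoch x 30 epochs).
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


print(effective_batch_size(per_device=2, accumulation_steps=8))
for step in (0, 250, 500, 6_000, 12_660):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 0.0003 at step 500 and reaches 0 at the end of training.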

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Wer    |
|:-------------:|:-----:|:-----:|:---------------:|:------:|
| 5.92          | 0.95  | 400   | 2.9522          | 1.0026 |
| 1.0435        | 1.89  | 800   | 0.8608          | 1.0552 |
| 0.5354        | 2.84  | 1200  | 0.7762          | 1.0169 |
| 0.404         | 3.79  | 1600  | 0.6984          | 1.0293 |
| 0.3301        | 4.73  | 2000  | 0.6811          | 1.0217 |
| 0.2745        | 5.68  | 2400  | 0.7027          | 1.0308 |
| 0.2346        | 6.63  | 2800  | 0.7296          | 1.0185 |
| 0.2096        | 7.57  | 3200  | 0.7148          | 1.0294 |
| 0.1912        | 8.52  | 3600  | 0.7109          | 1.0335 |
| 0.172         | 9.47  | 4000  | 0.7894          | 1.0252 |
| 0.1567        | 10.41 | 4400  | 0.7592          | 1.0219 |
| 0.1457        | 11.36 | 4800  | 0.8030          | 1.0141 |
| 0.1337        | 12.31 | 5200  | 0.7811          | 1.0237 |
| 0.1288        | 13.25 | 5600  | 0.7703          | 1.0188 |
| 0.1165        | 14.2  | 6000  | 0.7728          | 1.0199 |
| 0.105         | 15.15 | 6400  | 0.7934          | 1.0206 |
| 0.1028        | 16.09 | 6800  | 0.7978          | 1.0185 |
| 0.092         | 17.04 | 7200  | 0.8276          | 1.0289 |
| 0.0901        | 17.99 | 7600  | 0.7881          | 1.0202 |
| 0.0818        | 18.93 | 8000  | 0.7847          | 1.0162 |
| 0.0801        | 19.88 | 8400  | 0.8142          | 1.0230 |
| 0.0768        | 20.83 | 8800  | 0.7735          | 1.0215 |
| 0.0721        | 21.78 | 9200  | 0.7941          | 1.0227 |
| 0.0658        | 22.72 | 9600  | 0.8100          | 1.0219 |
| 0.0627        | 23.67 | 10000 | 0.7592          | 1.0196 |
| 0.0591        | 24.62 | 10400 | 0.8028          | 1.0210 |
| 0.0537        | 25.56 | 10800 | 0.8019          | 1.0253 |
| 0.0507        | 26.51 | 11200 | 0.7951          | 1.0212 |
| 0.0495        | 27.46 | 11600 | 0.7893          | 1.0207 |
| 0.0466        | 28.4  | 12000 | 0.7854          | 1.0188 |
| 0.0431        | 29.35 | 12400 | 0.7883          | 1.0199 |
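
In the table above, training loss falls steadily while validation loss reaches its minimum early and then drifts upward, so the final checkpoint is not the one with the lowest validation loss. A small sketch of how one might pick a checkpoint from such a log is shown below. The `EvalRow` type is a hypothetical helper, and the rows are a hand-copied subset of the table, not the full log.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    wer: float


# Subset of rows copied from the training results table.
rows = [
    EvalRow(400, 2.9522, 1.0026),
    EvalRow(1600, 0.6984, 1.0293),
    EvalRow(2000, 0.6811, 1.0217),
    EvalRow(2400, 0.7027, 1.0308),
    EvalRow(7200, 0.8276, 1.0289),
    EvalRow(12400, 0.7883, 1.0199),
]

best_by_loss = min(rows, key=lambda r: r.val_loss)
best_by_wer = min(rows, key=lambda r: r.wer)

print("lowest validation loss:", best_by_loss)
print("lowest WER:", best_by_wer)
```

For this subset the lowest validation loss is at step 2000 and the lowest WER is at step 400. Because WER stays close to 1.0 at every step, validation loss is probably the more informative column for comparing checkpoints in this run.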


### Framework versions

- Transformers 4.32.1
- PyTorch 2.2.0+cu121
- Datasets 2.12.0
- Tokenizers 0.13.2