metadata
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- common_voice
model-index:
- name: wav2vec2-large-xls-r-300m-Arabic-phoneme-based
results: []
wav2vec2-large-xls-r-300m-Arabic-phoneme-based
This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the common_voice dataset and a local dataset. It achieves the following results on the evaluation set:
- Loss: 0.7848
- Per: 0.2061
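The card does not define the "Per" metric. Given the phoneme-based model name, it is presumably a phoneme error rate: the total edit distance between predicted and reference phoneme sequences, divided by the total number of reference phonemes. The sketch below shows that computation under this assumption. It is not the evaluation script used for this model, and the function names and toy phoneme symbols are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(references, hypotheses):
    """Assumed definition: total edits / total reference phonemes."""
    total_edits = 0
    total_ref = 0
    for ref, hyp in zip(references, hypotheses):
        total_edits += edit_distance(ref, hyp)
        total_ref += len(ref)
    if total_ref == 0:
        raise ValueError("references contain no phonemes")
    return total_edits / total_ref


# Toy, made-up phoneme sequences (not taken from the evaluation data).
refs = [["k", "i", "t", "a", "b"], ["s", "a", "l", "a", "m"]]
hyps = [["k", "a", "t", "a", "b"], ["s", "a", "l", "m"]]
per = phoneme_error_rate(refs, hyps)  # 1 substitution + 1 deletion over 10 phonemes
```

Under this definition a value of 1.0, as reported after the first epoch, means the number of edits equals the number of reference phonemes.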
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 16
- eval_batch_size: 6
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 250
- num_epochs: 20.0
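To make the relationship between these values concrete, the sketch below derives the total batch size and evaluates a linear warmup and decay schedule in plain Python. It assumes a single training device (consistent with 16 × 4 = 64) and takes the total step count of 4440 from the last row of the training results table. The function names are hypothetical, and the schedule formula mirrors the usual linear-with-warmup behaviour rather than reproducing the original training script.

```python
LEARNING_RATE = 5e-4   # learning_rate
TRAIN_BATCH_SIZE = 16  # train_batch_size (per device)
GRAD_ACCUM_STEPS = 4   # gradient_accumulation_steps
WARMUP_STEPS = 250     # lr_scheduler_warmup_steps
TOTAL_STEPS = 4440     # final step in the training results table


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Samples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


total_batch = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS)
peak_lr = linear_lr(WARMUP_STEPS)   # learning rate at the end of warmup
final_lr = linear_lr(TOTAL_STEPS)   # learning rate at the last step
```

With these settings the learning rate peaks at step 250, a little after the first epoch (step 222), and decays linearly to zero by step 4440.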
Training results
Training Loss | Epoch | Step | Validation Loss | Per |
---|---|---|---|---|
3.7127 | 1.0 | 222 | 1.9854 | 1.0 |
1.9148 | 2.0 | 445 | 1.8780 | 0.9165 |
1.8497 | 3.0 | 667 | 1.8433 | 0.9122 |
1.779 | 4.0 | 890 | 1.7892 | 0.9056 |
1.7023 | 5.0 | 1112 | 1.7313 | 0.8936 |
1.6223 | 6.0 | 1335 | 1.6278 | 0.8729 |
1.5323 | 7.0 | 1557 | 1.4546 | 0.6137 |
1.2216 | 8.0 | 1780 | 0.9798 | 0.3830 |
0.8624 | 9.0 | 2002 | 0.7331 | 0.3021 |
0.6687 | 10.0 | 2225 | 0.6287 | 0.2529 |
0.5645 | 11.0 | 2447 | 0.5874 | 0.2290 |
0.4973 | 12.0 | 2670 | 0.5660 | 0.2140 |
0.4528 | 13.0 | 2892 | 0.5099 | 0.1967 |
0.412 | 14.0 | 3115 | 0.5045 | 0.1918 |
0.3837 | 15.0 | 3337 | 0.4800 | 0.1913 |
0.3519 | 16.0 | 3560 | 0.4698 | 0.1827 |
0.333 | 17.0 | 3782 | 0.4623 | 0.1802 |
0.3137 | 18.0 | 4005 | 0.4499 | 0.1714 |
0.297 | 19.0 | 4227 | 0.4446 | 0.1707 |
0.2874 | 19.96 | 4440 | 0.4393 | 0.1697 |
Framework versions
- Transformers 4.30.2
- Pytorch 2.0.1+cu118
- Datasets 1.18.3
- Tokenizers 0.13.3