---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- common_voice
model-index:
- name: wav2vec2-large-xls-r-300m-Arabic-phoneme-based
  results: []
---

# wav2vec2-large-xls-r-300m-Arabic-phoneme-based

This model is a fine-tuned version of [facebook/wav2vec2-xls-r-300m](https://huggingface.co/facebook/wav2vec2-xls-r-300m) on the common_voice dataset and a local dataset. It achieves the following results on the evaluation set:

- Loss: 0.7848
- PER (phoneme error rate): 0.2061
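
The checkpoint can be loaded like other wav2vec2 CTC fine-tunes. The sketch below is an assumption, not part of the original card: it assumes the model is published on the Hub as `nrshoudi/wav2vec2-large-xls-r-300m-Arabic-phoneme-based`, that the standard `Wav2Vec2Processor`/`Wav2Vec2ForCTC` classes apply to this phoneme vocabulary, and the audio path is a placeholder.

```python
import torch
import torchaudio
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

# Assumed Hub repository ID (taken from the model name on this card).
model_id = "nrshoudi/wav2vec2-large-xls-r-300m-Arabic-phoneme-based"

processor = Wav2Vec2Processor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)
model.eval()

# Load a mono audio file ("example.wav" is a placeholder) and resample to 16 kHz.
waveform, sample_rate = torchaudio.load("example.wav")
if sample_rate != 16_000:
    waveform = torchaudio.functional.resample(waveform, sample_rate, 16_000)

inputs = processor(waveform.squeeze().numpy(), sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(inputs.input_values).logits

# Greedy CTC decoding into a phoneme string.
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(predicted_ids)
print(transcription[0])
```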

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 0.0005
- train_batch_size: 16
- eval_batch_size: 6
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 250
- num_epochs: 20.0
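
These values map onto a `transformers` `TrainingArguments` configuration roughly as sketched below. This is a hypothetical reconstruction, not the exact training script: `output_dir` and any options not listed above are placeholders.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the configuration from the hyperparameters
# listed above; output_dir and unlisted options are placeholders.
training_args = TrainingArguments(
    output_dir="./wav2vec2-large-xls-r-300m-Arabic-phoneme-based",
    learning_rate=5e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=6,
    gradient_accumulation_steps=4,   # effective train batch size 16 * 4 = 64
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=250,
    num_train_epochs=20.0,
    # Adam settings match the values listed above (also the library defaults).
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```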

### Training results

| Training Loss | Epoch | Step | Validation Loss | PER    |
|:-------------:|:-----:|:----:|:---------------:|:------:|
| 3.7127        | 1.0   | 222  | 1.9854          | 1.0    |
| 1.9148        | 2.0   | 445  | 1.8780          | 0.9165 |
| 1.8497        | 3.0   | 667  | 1.8433          | 0.9122 |
| 1.779         | 4.0   | 890  | 1.7892          | 0.9056 |
| 1.7023        | 5.0   | 1112 | 1.7313          | 0.8936 |
| 1.6223        | 6.0   | 1335 | 1.6278          | 0.8729 |
| 1.5323        | 7.0   | 1557 | 1.4546          | 0.6137 |
| 1.2216        | 8.0   | 1780 | 0.9798          | 0.3830 |
| 0.8624        | 9.0   | 2002 | 0.7331          | 0.3021 |
| 0.6687        | 10.0  | 2225 | 0.6287          | 0.2529 |
| 0.5645        | 11.0  | 2447 | 0.5874          | 0.2290 |
| 0.4973        | 12.0  | 2670 | 0.5660          | 0.2140 |
| 0.4528        | 13.0  | 2892 | 0.5099          | 0.1967 |
| 0.412         | 14.0  | 3115 | 0.5045          | 0.1918 |
| 0.3837        | 15.0  | 3337 | 0.4800          | 0.1913 |
| 0.3519        | 16.0  | 3560 | 0.4698          | 0.1827 |
| 0.333         | 17.0  | 3782 | 0.4623          | 0.1802 |
| 0.3137        | 18.0  | 4005 | 0.4499          | 0.1714 |
| 0.297         | 19.0  | 4227 | 0.4446          | 0.1707 |
| 0.2874        | 19.96 | 4440 | 0.4393          | 0.1697 |
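
The PER column is a phoneme error rate, which is commonly computed by applying a word-error-rate metric to space-separated phoneme sequences. The sketch below illustrates this with the `load_metric` API from the Datasets version listed under framework versions; the phoneme strings are made-up placeholders.

```python
from datasets import load_metric  # Datasets 1.18.x API; newer code uses the `evaluate` library

# PER is WER computed over space-separated phoneme tokens.
per_metric = load_metric("wer")

references = ["b i s m i l l a h"]   # made-up phoneme sequences for illustration
predictions = ["b i s m i l a h"]

per = per_metric.compute(predictions=predictions, references=references)
print(f"PER: {per:.4f}")
```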

### Framework versions

- Transformers 4.30.2
- Pytorch 2.0.1+cu118
- Datasets 1.18.3
- Tokenizers 0.13.3