metadata

license: apache-2.0
base_model: facebook/wav2vec2-xls-r-300m
tags:
  - generated_from_trainer
datasets:
  - common_voice_17_0
metrics:
  - wer
model-index:
  - name: xlsr-mk
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: common_voice_17_0
          type: common_voice_17_0
          config: mk
          split: validation
          args: mk
        metrics:
          - name: Wer
            type: wer
            value: 0.4437212531458821

xlsr-mk

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the Macedonian (mk) configuration of the common_voice_17_0 dataset. It achieves the following results on the evaluation set (the validation split):

Loss: 0.6273
Wer: 0.4437
Cer: 0.1074
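
For readers unfamiliar with the two error rates, here is a minimal sketch of how WER (word error rate) and CER (character error rate) are commonly defined: the edit distance between reference and hypothesis, divided by the reference length, counted in words or characters respectively. The function names and the example sentences are illustrative assumptions. The card does not say which metric implementation or text normalization was used, so this may not reproduce the numbers above exactly.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution (or match when cost is 0)
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over reference length."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical transcripts, for illustration only.
reference = "the cat sat on the mat"
hypothesis = "the cat sit on mat"
print(f"WER: {wer(reference, hypothesis):.4f}")
print(f"CER: {cer(reference, hypothesis):.4f}")
```

A WER of 0.4437 therefore means roughly 44 word-level edits per 100 reference words, while the much lower CER of 0.1074 suggests many of the word errors differ from the reference by only a few characters.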

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

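Evaluation used the validation split of the mk configuration of common_voice_17_0, as recorded in the metadata. More information needed on the training data.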

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 16
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 50
mixed_precision_training: Native AMP
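
The hyperparameters above can be sketched as keyword arguments for `transformers.TrainingArguments`, together with a plain-Python version of the linear warmup and decay schedule. Several things here are assumptions rather than facts from the card: the `output_dir` value, training on a single device (so that `train_batch_size` equals the per-device batch size), mapping "Native AMP" to `fp16=True`, and the total of 2650 optimizer steps, which is inferred from the training log (about 53 steps per epoch over 50 epochs). The `linear_lr` helper is hypothetical and only mirrors the usual linear schedule; it is not the library's own code.

```python
# Hyperparameters from the card, keyed by TrainingArguments argument names.
training_kwargs = {
    "output_dir": "xlsr-mk",            # assumption: not stated in the card
    "learning_rate": 3e-4,
    "per_device_train_batch_size": 16,  # assumption: a single device was used
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 500,
    "num_train_epochs": 50,
    "fp16": True,                       # assumption: "Native AMP" means fp16
}

# Effective batch size per optimizer step: 16 samples x 2 accumulation steps.
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)

TOTAL_STEPS = 2650  # assumption: inferred as ~53 steps/epoch x 50 epochs


def linear_lr(step, total_steps=TOTAL_STEPS, peak_lr=3e-4, warmup_steps=500):
    """Linear warmup from 0 to peak_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


print("effective batch size:", effective_batch_size)
for step in (0, 250, 500, 1575, 2650):
    print(step, linear_lr(step))
```

With the library installed and a GPU available, the dictionary could be passed as `TrainingArguments(**training_kwargs)`. Under these assumptions the learning rate peaks at step 500, which in the training log is about epoch 9.4.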

Training results

Training Loss	Epoch	Step	Validation Loss	Wer	Cer
3.541	1.8868	100	3.5532	1.0	1.0
2.966	3.7736	200	2.9438	1.0	1.0
2.298	5.6604	300	2.1673	1.0	0.7080
0.5999	7.5472	400	0.7521	0.7476	0.2035
0.3941	9.4340	500	0.7249	0.6911	0.1845
0.2226	11.3208	600	0.6970	0.6602	0.1725
0.3031	13.2075	700	0.7692	0.6506	0.1680
0.1621	15.0943	800	0.7229	0.6232	0.1583
0.2052	16.9811	900	0.6990	0.5722	0.1471
0.1441	18.8679	1000	0.6829	0.5591	0.1400
0.0548	20.7547	1100	0.6560	0.5309	0.1333
0.1312	22.6415	1200	0.6590	0.5375	0.1332
0.0582	24.5283	1300	0.7023	0.5268	0.1321
0.1163	26.4151	1400	0.6900	0.5170	0.1293
0.0491	28.3019	1500	0.6499	0.5089	0.1274
0.063	30.1887	1600	0.6478	0.4869	0.1221
0.0735	32.0755	1700	0.6678	0.4967	0.1256
0.0437	33.9623	1800	0.6651	0.4803	0.1188
0.0514	35.8491	1900	0.6741	0.4724	0.1168
0.0306	37.7358	2000	0.6564	0.4717	0.1168
0.0458	39.6226	2100	0.6428	0.4679	0.1140
0.0398	41.5094	2200	0.6385	0.4531	0.1103
0.0574	43.3962	2300	0.5991	0.4392	0.1063
0.0481	45.2830	2400	0.6394	0.4468	0.1087
0.0376	47.1698	2500	0.6184	0.4434	0.1072
0.0275	49.0566	2600	0.6273	0.4437	0.1074
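
The headline results match the final row of the log (step 2600), although the lowest validation WER in the table occurs earlier, at step 2300. The short sketch below, using the last five rows copied from the table, shows how one might find the best evaluation step and recover the number of optimizer steps per epoch from the Epoch and Step columns. The tuple layout and variable names are illustrative assumptions.

```python
# (epoch, step, validation_loss, wer, cer) rows copied from the end of the log.
rows = [
    (41.5094, 2200, 0.6385, 0.4531, 0.1103),
    (43.3962, 2300, 0.5991, 0.4392, 0.1063),
    (45.2830, 2400, 0.6394, 0.4468, 0.1087),
    (47.1698, 2500, 0.6184, 0.4434, 0.1072),
    (49.0566, 2600, 0.6273, 0.4437, 0.1074),
]

# Evaluation step with the lowest validation WER among these rows.
best = min(rows, key=lambda row: row[3])

# Optimizer steps per epoch, recovered from the final row's step/epoch ratio.
final_epoch, final_step = rows[-1][0], rows[-1][1]
steps_per_epoch = round(final_step / final_epoch)

print("best step:", best[1], "WER:", best[3])
print("steps per epoch:", steps_per_epoch)
```

If checkpoint selection by WER is wanted in a rerun, `TrainingArguments` offers `load_best_model_at_end` and `metric_for_best_model`; whether either was set for this run is not stated in the card.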

Framework versions

Transformers 4.42.0.dev0
Pytorch 2.3.1+cu121
Datasets 2.19.2
Tokenizers 0.19.1