README.md · anuragshas/wav2vec2-xls-r-300m-bn-cv9-with-lm at main

language:
  - bn
license: apache-2.0
tags:
  - automatic-speech-recognition
  - mozilla-foundation/common_voice_9_0
  - generated_from_trainer
datasets:
  - mozilla-foundation/common_voice_9_0
metrics:
  - wer
model-index:
  - name: XLS-R-300M - Bengali
    results:
      - task:
          type: automatic-speech-recognition
          name: Speech Recognition
        dataset:
          type: mozilla-foundation/common_voice_9_0
          name: Common Voice 9
          args: bn
        metrics:
          - type: wer
            value: 20.15
            name: Test WER
          - name: Test CER
            type: cer
            value: 4.813

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m on the Bengali (bn) subset of the mozilla-foundation/common_voice_9_0 dataset. It achieves the following results on the evaluation set:

Loss: 0.2297
Wer: 0.2850
Cer: 0.0660
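
The Wer and Cer values above are fractions (0.2850 corresponds to 28.50%), while the values in the metadata block appear to be percentages. As a rough illustration of what the two metrics measure, the sketch below computes both from a plain Levenshtein distance. It assumes whitespace tokenization and no text normalization, the function names are hypothetical, and it is not the evaluation script used for this model.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the hypothesis
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edits divided by reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


# One substitution ("sat" -> "sit") and one deletion ("the") over six reference words.
print(wer("the cat sat on the mat", "the cat sit on mat"))
```

Because the denominator is the reference length, insertions can push the rate above 1.0, which is consistent with the Wer of 1.0052 logged at step 800 in the training results.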

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 7.5e-05
train_batch_size: 64
eval_batch_size: 64
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
training_steps: 8692
mixed_precision_training: Native AMP
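
Two of these values follow from the others. total_train_batch_size equals train_batch_size times gradient_accumulation_steps (64 × 2 = 128), which suggests training on a single device. The linear scheduler with lr_scheduler_warmup_ratio 0.1 raises the learning rate over roughly the first 10% of the 8692 steps and then decays it linearly to zero. The sketch below reproduces that shape in plain Python. The function name is hypothetical, and rounding the warmup length up to 870 steps is an assumption about how the trainer derives it.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 7.5e-05
TRAINING_STEPS = 8692
WARMUP_RATIO = 0.1
TRAIN_BATCH_SIZE = 64
GRADIENT_ACCUMULATION_STEPS = 2

# Assumption: warmup length is ratio * total steps, rounded up.
WARMUP_STEPS = math.ceil(TRAINING_STEPS * WARMUP_RATIO)

# Assumption: a single device, so no extra multiplier for data parallelism.
effective_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS


def linear_schedule_lr(step):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TRAINING_STEPS - WARMUP_STEPS)


for step in (0, 400, WARMUP_STEPS, 4400, TRAINING_STEPS):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the peak rate of 7.5e-05 is reached at the end of warmup, so the first logged checkpoint at step 400 still falls within the warmup phase.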

Training results

Training Loss	Epoch	Step	Validation Loss	Wer	Cer
3.675	2.3	400	3.5052	1.0	1.0
3.0446	4.6	800	2.2759	1.0052	0.5215
1.7276	6.9	1200	0.7083	0.6697	0.1969
1.5171	9.2	1600	0.5328	0.5733	0.1568
1.4176	11.49	2000	0.4571	0.5161	0.1381
1.343	13.79	2400	0.3910	0.4522	0.1160
1.2743	16.09	2800	0.3534	0.4137	0.1044
1.2396	18.39	3200	0.3278	0.3877	0.0959
1.2035	20.69	3600	0.3109	0.3741	0.0917
1.1745	22.99	4000	0.2972	0.3618	0.0882
1.1541	25.29	4400	0.2836	0.3427	0.0832
1.1372	27.59	4800	0.2759	0.3357	0.0812
1.1048	29.89	5200	0.2669	0.3284	0.0783
1.0966	32.18	5600	0.2678	0.3249	0.0775
1.0747	34.48	6000	0.2547	0.3134	0.0748
1.0593	36.78	6400	0.2491	0.3077	0.0728
1.0417	39.08	6800	0.2450	0.3012	0.0711
1.024	41.38	7200	0.2402	0.2956	0.0694
1.0106	43.68	7600	0.2351	0.2915	0.0681
1.0014	45.98	8000	0.2328	0.2896	0.0673
0.9999	48.28	8400	0.2318	0.2866	0.0667
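
The Epoch and Step columns together give a rough sense of how long the run was. The short sketch below uses only rows from the table and the training_steps value from the hyperparameter list to estimate the optimizer steps per epoch and the total number of epochs. The variable names are hypothetical, and the results are approximate because the Epoch column is rounded.

```python
# (epoch, step) pairs copied from the training results table.
checkpoints = [(2.3, 400), (25.29, 4400), (48.28, 8400)]
TRAINING_STEPS = 8692  # from the hyperparameter list

# Use the last logged row, where rounding in the Epoch column matters least.
last_epoch, last_step = checkpoints[-1]
steps_per_epoch = last_step / last_epoch
estimated_epochs = TRAINING_STEPS / steps_per_epoch

print(f"~{steps_per_epoch:.0f} steps per epoch, ~{estimated_epochs:.0f} epochs in total")
```

This suggests the run covered about 50 epochs, and that the final results reported for the evaluation set come from a point a few hundred steps after the last row in the table.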

Framework versions

Transformers 4.19.0.dev0
PyTorch 1.11.0+cu102
Datasets 2.1.1.dev0
Tokenizers 0.12.1