metadata

language:
  - de
license: apache-2.0
tags:
  - automatic-speech-recognition
  - mozilla-foundation/common_voice_9_0
  - generated_from_trainer
datasets:
  - mozilla-foundation/common_voice_9_0
base_model: facebook/wav2vec2-large-xlsr-53
model-index:
  - name: wav2vec2-large-xlsr-53-german-cv9
    results:
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: Common Voice 9
          type: mozilla-foundation/common_voice_9_0
          args: de
        metrics:
          - type: wer
            value: 9.48066328184077
            name: Test WER
          - type: cer
            value: 1.9167347943074393
            name: Test CER
          - type: wer
            value: 7.49027762774117
            name: Test WER (+LM)
          - type: cer
            value: 1.9167347943074393
            name: Test CER (+LM)
      - task:
          type: automatic-speech-recognition
          name: Automatic Speech Recognition
        dataset:
          name: Common Voice 6.1
          type: common_voice
          args: de
        metrics:
          - type: wer
            value: 8.122005951166669
            name: Test WER
          - type: cer
            value: 1
            name: Test CER
          - type: wer
            value: 6.145318204520354
            name: Test WER (+LM)
          - type: cer
            value: 1.5247743373447677
            name: Test CER (+LM)

wav2vec2-large-xlsr-53-german-cv9

This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53 on the German (de) subset of the mozilla-foundation/common_voice_9_0 dataset.

It achieves the following results on the Common Voice 9 test set (values in percent, without a language model):

CER: 2.273015898213336
WER: 9.480663281840769
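
For readers unfamiliar with the two metrics, the following is a minimal sketch of how WER and CER are commonly computed: the Levenshtein (edit) distance between reference and hypothesis, divided by the reference length, over words for WER and over characters for CER. The function names and the example sentences are illustrative assumptions. The evaluation script and text normalization used for the figures above are not documented in this card, so this sketch is not expected to reproduce them exactly.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent (spaces count as characters here)."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example pair, not taken from the dataset.
reference = "das ist ein kleiner test"
hypothesis = "das ist ein kleine test"
print(f"WER: {wer(reference, hypothesis):.2f}")  # WER: 20.00
print(f"CER: {cer(reference, hypothesis):.2f}")  # CER: 4.17
```

One wrong word out of five gives a WER of 20%, while the single missing character out of 24 gives a CER of about 4.17%, which is why CER is typically much lower than WER for the same output.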

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

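The model was fine-tuned on the German (de) subset of mozilla-foundation/common_voice_9_0 and evaluated on the German test sets of Common Voice 9 and Common Voice 6.1. More information needed on the exact splits and preprocessing.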

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 16
eval_batch_size: 32
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 50.0
mixed_precision_training: Native AMP

Training results

Training Loss	Epoch	Step	Validation Loss	Eval WER
0.4129	1.0	3557	0.3015	0.2499
0.2121	2.0	7114	0.1596	0.1567
0.1455	3.0	10671	0.1377	0.1354
0.1436	4.0	14228	0.1301	0.1282
0.1144	5.0	17785	0.1225	0.1245
0.1219	6.0	21342	0.1254	0.1208
0.104	7.0	24899	0.1198	0.1232
0.1016	8.0	28456	0.1149	0.1174
0.1093	9.0	32013	0.1186	0.1186
0.0858	10.0	35570	0.1182	0.1164
0.102	11.0	39127	0.1191	0.1186
0.0834	12.0	42684	0.1161	0.1096
0.0916	13.0	46241	0.1147	0.1107
0.0811	14.0	49798	0.1174	0.1136
0.0814	15.0	53355	0.1132	0.1114
0.0865	16.0	56912	0.1134	0.1097
0.0701	17.0	60469	0.1096	0.1054
0.0891	18.0	64026	0.1110	0.1076
0.071	19.0	67583	0.1141	0.1074
0.0726	20.0	71140	0.1094	0.1093
0.0647	21.0	74697	0.1088	0.1095
0.0643	22.0	78254	0.1105	0.1044
0.0764	23.0	81811	0.1072	0.1042
0.0605	24.0	85368	0.1095	0.1026
0.0722	25.0	88925	0.1144	0.1066
0.0597	26.0	92482	0.1087	0.1022
0.062	27.0	96039	0.1073	0.1027
0.0536	28.0	99596	0.1068	0.1027
0.0616	29.0	103153	0.1097	0.1037
0.0642	30.0	106710	0.1117	0.1020
0.0555	31.0	110267	0.1109	0.0990
0.0632	32.0	113824	0.1104	0.0977
0.0482	33.0	117381	0.1108	0.0958
0.0601	34.0	120938	0.1095	0.0957
0.0508	35.0	124495	0.1079	0.0973
0.0526	36.0	128052	0.1068	0.0967
0.0487	37.0	131609	0.1081	0.0966
0.0495	38.0	135166	0.1099	0.0956
0.0528	39.0	138723	0.1091	0.0923
0.0439	40.0	142280	0.1111	0.0928
0.0467	41.0	145837	0.1131	0.0943
0.0407	42.0	149394	0.1115	0.0944
0.046	43.0	152951	0.1106	0.0935
0.0447	44.0	156508	0.1083	0.0919
0.0434	45.0	160065	0.1093	0.0909
0.0472	46.0	163622	0.1092	0.0921
0.0414	47.0	167179	0.1106	0.0922
0.0501	48.0	170736	0.1094	0.0918
0.0388	49.0	174293	0.1099	0.0918
0.0428	50.0	177850	0.1103	0.0915
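
Note that the Eval WER column is a fraction, whereas the test results are reported in percent. As a small sketch of how to read the table, the snippet below converts a handful of rows copied from it to percent and picks the epoch with the lowest Eval WER. Only a subset of rows is included for brevity, and the variable names are illustrative.

```python
# (epoch, validation_loss, eval_wer) for a subset of rows from the table above.
rows = [
    (1, 0.3015, 0.2499),
    (10, 0.1182, 0.1164),
    (20, 0.1094, 0.1093),
    (30, 0.1117, 0.1020),
    (40, 0.1111, 0.0928),
    (45, 0.1093, 0.0909),
    (50, 0.1103, 0.0915),
]

# Select the row with the lowest word error rate.
best_epoch, best_loss, best_wer = min(rows, key=lambda row: row[2])

print(f"best epoch: {best_epoch}")             # best epoch: 45
print(f"eval WER: {best_wer * 100:.2f}%")      # eval WER: 9.09%
```

Epoch 45 is also the minimum over the full table. The validation loss flattens out much earlier than the WER does, so the two columns do not agree on a single best epoch.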

Framework versions

Transformers 4.19.0.dev0
PyTorch 1.11.0+cu113
Datasets 2.0.0
Tokenizers 0.11.6