metadata

language:
  - de
license: apache-2.0
tags:
  - automatic-speech-recognition
  - mozilla-foundation/common_voice_9_0
  - generated_from_trainer
datasets:
  - common_voice
model-index:
  - name: wav2vec2-base-german-cv9
    results: []

wav2vec2-base-german-cv9

This model is a fine-tuned version of facebook/wav2vec2-base on the German (de) subset of the mozilla-foundation/common_voice_9_0 dataset. It achieves the following results on the evaluation set:

Loss: 0.1742
WER (word error rate): 0.1209
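
For readers unfamiliar with the metric, the word error rate is the number of word-level substitutions, deletions, and insertions needed to turn the model's transcript into the reference, divided by the number of reference words. The card does not say which tool computed the figure above, so the following is only a minimal, dependency-free sketch of the definition. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()  # assumption: simple whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One of four reference words is missing from the hypothesis.
print(word_error_rate("das ist ein test", "das ist test"))
```

A corpus-level score such as the 0.1209 reported here is normally obtained by summing errors and reference words over all utterances before dividing, rather than by averaging per-utterance values. Text normalization (casing, punctuation) also affects the result and is not documented in this card.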

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 16
eval_batch_size: 32
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 50.0
mixed_precision_training: Native AMP
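
The listed values fit together arithmetically, and the sketch below shows how. It derives the effective batch size and the learning rate implied by a linear schedule with warmup. It is an illustration under stated assumptions, not the training script: `NUM_DEVICES = 1` is assumed because 16 × 8 already equals the reported 128, `TOTAL_STEPS` is taken from the final row of the training results table, and `linear_lr` is a hypothetical helper that mirrors the usual linear warmup and linear decay shape.

```python
# Values copied from the hyperparameter list.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 8
WARMUP_RATIO = 0.1

# Final "Step" value in the training results table (50 epochs x 3557 steps).
TOTAL_STEPS = 177850

# Assumption: training ran on a single device.
NUM_DEVICES = 1

total_train_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES
warmup_steps = round(TOTAL_STEPS * WARMUP_RATIO)


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return LEARNING_RATE * (step / warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * (remaining / (TOTAL_STEPS - warmup_steps))


print(total_train_batch_size)  # effective samples per optimizer step
print(warmup_steps)            # steps spent ramping up to the peak rate
print(linear_lr(warmup_steps)) # peak learning rate
```

Under these assumptions the warmup covers the first five epochs, after which the rate decays linearly until the end of epoch 50.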

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
0.6827	1.0	3557	0.6695	0.6247
0.3992	2.0	7114	0.3738	0.3936
0.2611	3.0	10671	0.3011	0.3177
0.2536	4.0	14228	0.2672	0.2749
0.1943	5.0	17785	0.2487	0.2480
0.2004	6.0	21342	0.2246	0.2268
0.1605	7.0	24899	0.2176	0.2120
0.1579	8.0	28456	0.2046	0.2024
0.1668	9.0	32013	0.2027	0.1944
0.1338	10.0	35570	0.1968	0.1854
0.1478	11.0	39127	0.1963	0.1823
0.1177	12.0	42684	0.1956	0.1800
0.1245	13.0	46241	0.1889	0.1732
0.1124	14.0	49798	0.1868	0.1714
0.1112	15.0	53355	0.1805	0.1650
0.1209	16.0	56912	0.1860	0.1614
0.1002	17.0	60469	0.1828	0.1604
0.118	18.0	64026	0.1832	0.1580
0.0974	19.0	67583	0.1771	0.1555
0.1007	20.0	71140	0.1812	0.1532
0.0866	21.0	74697	0.1752	0.1504
0.0901	22.0	78254	0.1690	0.1477
0.0964	23.0	81811	0.1773	0.1489
0.085	24.0	85368	0.1776	0.1456
0.0945	25.0	88925	0.1786	0.1428
0.0804	26.0	92482	0.1737	0.1429
0.0832	27.0	96039	0.1789	0.1394
0.0683	28.0	99596	0.1741	0.1390
0.0761	29.0	103153	0.1688	0.1379
0.0833	30.0	106710	0.1726	0.1370
0.0753	31.0	110267	0.1774	0.1353
0.08	32.0	113824	0.1734	0.1344
0.0644	33.0	117381	0.1737	0.1334
0.0745	34.0	120938	0.1763	0.1335
0.0629	35.0	124495	0.1761	0.1311
0.0654	36.0	128052	0.1718	0.1302
0.0656	37.0	131609	0.1697	0.1301
0.0643	38.0	135166	0.1716	0.1279
0.0683	39.0	138723	0.1777	0.1279
0.0587	40.0	142280	0.1735	0.1271
0.0693	41.0	145837	0.1780	0.1260
0.0532	42.0	149394	0.1724	0.1245
0.0594	43.0	152951	0.1736	0.1250
0.0544	44.0	156508	0.1744	0.1238
0.0559	45.0	160065	0.1770	0.1232
0.0557	46.0	163622	0.1766	0.1231
0.0521	47.0	167179	0.1751	0.1220
0.0591	48.0	170736	0.1724	0.1217
0.0507	49.0	174293	0.1753	0.1212
0.0577	50.0	177850	0.1742	0.1209

Framework versions

Transformers 4.20.1
PyTorch 1.11.0+cu113
Datasets 2.0.0
Tokenizers 0.11.6