patrickvonplaten committed 408690c (parent: 419a9e3): Create README.md

Fine-tuning of `wav2vec2-base` on 100h of Librispeech training data. Results on "clean" data are very similar to those of the [official model](https://huggingface.co/facebook/wav2vec2-base-100h). However, the result on "other" is significantly worse; the model seems to have overfit to the "clean" data.

The model was trained on *librispeech-clean-train.100* with the following hyper-parameters:

- 2 Titan RTX GPUs
- 13,000 total update steps
- Batch size per GPU: 32, corresponding to a *total batch size* of ~1500 seconds of audio
- Adam optimizer with a linearly decaying learning rate and 3000 warmup steps
- Dynamic grouping for batches
- fp16 training
- `attention_mask` was **not** used during training
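
The "dynamic grouping" above presumably refers to duration-based batching: utterances of similar length are grouped and each batch is filled up to an audio-time budget rather than a fixed sample count, which keeps padding low. A minimal sketch of that idea (the durations and budget below are made-up illustrative values, not the actual training configuration):

```python
# Length-based dynamic batching sketch: sort utterances by duration, then
# greedily fill each batch until a fixed audio budget (in seconds) is reached.
# Sorting first keeps utterances of similar length together, minimizing padding.

def dynamic_batches(durations, budget_seconds):
    """Group utterance indices into batches whose total duration fits the budget."""
    order = sorted(range(len(durations)), key=lambda i: durations[i])
    batches, current, current_total = [], [], 0.0
    for i in order:
        # Close the current batch if adding this utterance would exceed the budget.
        if current and current_total + durations[i] > budget_seconds:
            batches.append(current)
            current, current_total = [], 0.0
        current.append(i)
        current_total += durations[i]
    if current:
        batches.append(current)
    return batches

durations = [3.2, 14.7, 5.1, 9.8, 4.4, 12.0]  # seconds per utterance (made up)
print(dynamic_batches(durations, budget_seconds=20.0))
```

With a real total budget of ~1500 seconds split across 2 GPUs, each per-GPU batch of 32 utterances would average roughly 23 seconds of audio per sample.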

Training logs: https://wandb.ai/patrickvonplaten/huggingface/reports/Project-Dashboard--Vmlldzo1MDI2MTU?accessToken=69z0mrkoxs1msgh71p4nntr9shi6mll8rhtbo6c56yynygw0scp11d8z9o1xd0uk

*Result (WER)* on Librispeech test:

| "clean" | "other" |
|---|---|
| 6.5 | 18.7 |
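
For reference, the word error rate (WER) reported above is the word-level edit distance between the reference transcript and the model's hypothesis, normalized by the reference length. A minimal illustration of the metric (not the exact evaluation script used for this model):

```python
# Word error rate: (substitutions + insertions + deletions) / reference words,
# computed with a standard dynamic-programming edit distance over words.

def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words.
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # substitution or match
            )
    return dp[len(ref)][len(hyp)] / len(ref)

# One dropped word out of six reference words -> WER of 1/6.
print(round(wer("the cat sat on the mat", "the cat sat on mat"), 3))
```

A table entry of 6.5 therefore means the model got roughly one word in fifteen wrong on "clean" test speech.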