fxtentacle committed on
Commit
0503bb2
1 Parent(s): a0402ee

Update README.md

Files changed (1)
  1. README.md +21 -4
README.md CHANGED
@@ -35,19 +35,36 @@ model-index:
 ## Overview
 
 This folder contains a fully trained German speech recognition pipeline
-consisting of an acoustic model using the new wav2vec 2.0 XLS-R 1B TEVR architecture
+consisting of an acoustic model using the new wav2vec 2.0 XLS-R 1B **TEVR** architecture
 and a 5-gram KenLM language model.
 For an explanation of the TEVR enhancements and their motivation, please see our paper:
-TEVR: Improving XLS-R for German ASR through Token Entropy Variance Reduction
-(Krabbenhöft et al., 2022).
+[TEVR: Improving Speech Recognition by Token Entropy Variance Reduction](https://arxiv.org/abs/2206.12693).
 
 
 This pipeline scores a very competitive (as of June 2022) **word error rate of 3.64%** on CommonVoice German.
 
+## Citation
+
+If you use this ASR pipeline for research, please cite:
+```bibtex
+@misc{https://doi.org/10.48550/arxiv.2206.12693,
+doi = {10.48550/ARXIV.2206.12693},
+url = {https://arxiv.org/abs/2206.12693},
+author = {Krabbenhöft, Hajo Nils and Barth, Erhardt},
+keywords = {Computation and Language (cs.CL), Sound (cs.SD), Audio and Speech Processing (eess.AS), FOS: Computer and information sciences, FOS: Computer and information sciences, FOS: Electrical engineering, electronic engineering, information engineering, FOS: Electrical engineering, electronic engineering, information engineering, F.2.1; I.2.6; I.2.7},
+title = {TEVR: Improving Speech Recognition by Token Entropy Variance Reduction},
+publisher = {arXiv},
+year = {2022},
+copyright = {Creative Commons Attribution 4.0 International}
+}
+```
+
+## Evaluation
+
 To evaluate this pipeline yourself and/or on your own data, see the `HF Eval Script.ipynb` Jupyter Notebook
 or use the following python script:
 
-## Evaluation
+
 
 ```python
 !pip install --quiet --root-user-action=ignore --upgrade pip