fxtentacle/wav2vec2-xls-r-1b-tevr



fxtentacle committed on Jun 28, 2022

Commit 0503bb2 • 1 Parent(s): a0402ee

Update README.md

Files changed (1)

README.md +21 -4


@@ -35,19 +35,36 @@ model-index:
 ## Overview
 This folder contains a fully trained German speech recognition pipeline
-consisting of an acoustic model using the new wav2vec 2.0 XLS-R 1B TEVR architecture
 and a 5-gram KenLM language model.
 For an explanation of the TEVR enhancements and their motivation, please see our paper:
-TEVR: Improving XLS-R for German ASR through Token Entropy Variance Reduction
-(Krabbenhöft et al., 2022).
 This pipeline scores a very competitive (as of June 2022) **word error rate of 3.64%** on CommonVoice German.
 To evaluate this pipeline yourself and/or on your own data, see the `HF Eval Script.ipynb` Jupyter Notebook
 or use the following Python script:
-## Evaluation
 ```python
 !pip install --quiet --root-user-action=ignore --upgrade pip

 ## Overview
 This folder contains a fully trained German speech recognition pipeline
+consisting of an acoustic model using the new wav2vec 2.0 XLS-R 1B **TEVR** architecture
 and a 5-gram KenLM language model.
 For an explanation of the TEVR enhancements and their motivation, please see our paper:
+[TEVR: Improving Speech Recognition by Token Entropy Variance Reduction](https://arxiv.org/abs/2206.12693).
 This pipeline scores a very competitive (as of June 2022) **word error rate of 3.64%** on CommonVoice German.
+## Citation
+If you use this ASR pipeline for research, please cite:
+```bibtex
+@misc{https://doi.org/10.48550/arxiv.2206.12693,
+  doi = {10.48550/ARXIV.2206.12693},
+  url = {https://arxiv.org/abs/2206.12693},
+  author = {Krabbenhöft, Hajo Nils and Barth, Erhardt},
+  keywords = {Computation and Language (cs.CL), Sound (cs.SD), Audio and Speech Processing (eess.AS), FOS: Computer and information sciences, FOS: Computer and information sciences, FOS: Electrical engineering, electronic engineering, information engineering, FOS: Electrical engineering, electronic engineering, information engineering, F.2.1; I.2.6; I.2.7},
+  title = {TEVR: Improving Speech Recognition by Token Entropy Variance Reduction},
+  publisher = {arXiv},
+  year = {2022},
+  copyright = {Creative Commons Attribution 4.0 International}
+}
+```
+## Evaluation
 To evaluate this pipeline yourself and/or on your own data, see the `HF Eval Script.ipynb` Jupyter Notebook
 or use the following Python script:
 ```python
 !pip install --quiet --root-user-action=ignore --upgrade pip