ESPnet
102 languages
audio
self-supervised-learning
speech-recognition
wanchichen committed
Commit d4cf80d
Parent: 78a2f20

Update README.md

Files changed (1): README.md (+10, -0)
README.md CHANGED
@@ -127,6 +127,16 @@ This model was trained by [William Chen](https://wanchichen.github.io/) using ES
  WavLabLM is a self-supervised audio encoder pre-trained on 40,000 hours of multilingual data across 136 languages. This specific variant, WavLabLM-MS, went through a second stage of pre-training on a balanced subset of the data to improve performance on lower-resource languages.
  It achieves comparable performance to XLS-R 128 on the [ML-SUPERB Benchmark](https://arxiv.org/abs/2305.10615) with only 10% of the pre-training data.

+ ```BibTex
+ @misc{chen2023joint,
+   title={Joint Prediction and Denoising for Large-scale Multilingual Self-supervised Learning},
+   author={William Chen and Jiatong Shi and Brian Yan and Dan Berrebbi and Wangyou Zhang and Yifan Peng and Xuankai Chang and Soumi Maiti and Shinji Watanabe},
+   year={2023},
+   eprint={2309.15317},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL}
+ }
+ ```


  ### Citing ESPnet
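
As a usage illustration, here is a minimal sketch of extracting frame-level features from a frozen SSL encoder such as WavLabLM-MS through S3PRL's generic upstream interface (`s3prl.nn.S3PRLUpstream`), the same kind of frozen-feature setup that ML-SUPERB evaluates. The upstream identifier `wavlablm_ms_40k` below is an assumption for illustration only; check the S3PRL upstream registry or this model card's files for the actual name or checkpoint path.

```python
import torch
from s3prl.nn import S3PRLUpstream

# NOTE: "wavlablm_ms_40k" is a hypothetical upstream name used for illustration;
# consult S3PRL's registered upstreams / this model card for the real identifier.
model = S3PRLUpstream("wavlablm_ms_40k")
model.eval()

# Two dummy 16 kHz waveforms, zero-padded to the same tensor length.
wavs = torch.randn(2, 2 * 16000)
wavs_len = torch.LongTensor([2 * 16000, 16000])

with torch.no_grad():
    all_hs, all_hs_len = model(wavs, wavs_len)

# all_hs is a list of per-layer hidden states, each of shape (batch, frames, hidden_dim);
# downstream probes typically learn a weighted sum over these layers.
print(len(all_hs), all_hs[0].shape)
```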