Update README.md
README.md
CHANGED
@@ -12,7 +12,7 @@ This is a monolingual Slovak Wav2Vec 2.0 base model pre-trained from 17 thousand

This model does not have a tokenizer as it was pretrained on audio alone. In order to use this model for speech recognition, a tokenizer should be created, and the model should be fine-tuned on labeled data.

-The model was initialized from [fav-kky/wav2vec2-base-cs-80k-ClTRUS](https://huggingface.co/fav-kky/wav2vec2-base-cs-80k-ClTRUS)
+The model was initialized from the Czech pre-trained model [fav-kky/wav2vec2-base-cs-80k-ClTRUS](https://huggingface.co/fav-kky/wav2vec2-base-cs-80k-ClTRUS). We found this cross-language transfer learning approach better than pre-training from scratch. See our paper for details.

## Pretraining data
Almost 18 thousand hours of unlabeled Slovak speech:
@@ -51,24 +51,29 @@ After fine-tuning, the model scored the following results on public datasets:

See our paper for details.

## Paper
-The preprint of our paper (accepted to TSD 2023) is available at
+The preprint of our paper (accepted to TSD 2023) is available at https://arxiv.org/abs/2306.04399.

## Citation
If you find this model useful, please cite our paper:
```
-@inproceedings{wav2vec2-base-
+@inproceedings{wav2vec2-base-sk-17k,
  title = {{Transfer Learning of Transformer-based Speech Recognition Models from Czech to Slovak}},
  author = {
    Jan Lehe\v{c}ka and
    Josef V. Psutka and
    Josef Psutka
  },
-  booktitle = {{
-  publisher = {{Springer}},
-  year = {
+  booktitle = {{Text, Speech, and Dialogue}},
+  publisher = {{Springer International Publishing}},
+  year = {2023},
  note = {(in press)},
+  url = {https://arxiv.org/abs/2306.04399},
}
```

-## Related
+## Related papers
+- [INTERSPEECH 2022 - Exploring Capabilities of Monolingual Audio Transformers using Large Datasets in Automatic Speech Recognition of Czech](https://www.isca-speech.org/archive/pdfs/interspeech_2022/lehecka22_interspeech.pdf)
+- INTERSPEECH 2023 - Transformer-based Speech Recognition Models for Oral History Archives in English, German, and Czech
+
+## Related models
- [fav-kky/wav2vec2-base-cs-80k-ClTRUS](https://huggingface.co/fav-kky/wav2vec2-base-cs-80k-ClTRUS)
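As a rough illustration of the cross-language initialization the updated card describes, here is a minimal sketch assuming the Hugging Face `transformers` API; the card does not state which pre-training framework the authors actually used:

```python
# Minimal sketch (assumption: transformers' Wav2Vec2ForPreTraining class):
# warm-start Slovak wav2vec 2.0 pre-training from the Czech checkpoint named
# in the card, rather than from a random initialization.
from transformers import Wav2Vec2ForPreTraining

model = Wav2Vec2ForPreTraining.from_pretrained("fav-kky/wav2vec2-base-cs-80k-ClTRUS")
# ... continue self-supervised pre-training on the unlabeled Slovak audio
# listed under "Pretraining data" ...
```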
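Since the card notes the model ships without a tokenizer, a hedged sketch of preparing it for CTC fine-tuning may also help. The repository id `fav-kky/wav2vec2-base-sk-17k` is inferred from the citation key and may differ, and `vocab.json` is a hypothetical character vocabulary built from your labeled Slovak transcripts:

```python
# Hedged sketch: build a tokenizer and attach a fresh CTC head for fine-tuning.
from transformers import (
    Wav2Vec2CTCTokenizer,
    Wav2Vec2FeatureExtractor,
    Wav2Vec2ForCTC,
    Wav2Vec2Processor,
)

# Character-level tokenizer from a vocab.json built from your transcripts
# (hypothetical file; the model card ships no tokenizer).
tokenizer = Wav2Vec2CTCTokenizer(
    "vocab.json", unk_token="[UNK]", pad_token="[PAD]", word_delimiter_token="|"
)

# Feature extractor matching the usual 16 kHz wav2vec 2.0 base setup.
feature_extractor = Wav2Vec2FeatureExtractor(
    feature_size=1,
    sampling_rate=16000,
    padding_value=0.0,
    do_normalize=True,
    return_attention_mask=True,
)
processor = Wav2Vec2Processor(feature_extractor=feature_extractor, tokenizer=tokenizer)

# Load the pre-trained encoder; the CTC head is newly initialized and sized
# to the new vocabulary. The repo id below is an assumption.
model = Wav2Vec2ForCTC.from_pretrained(
    "fav-kky/wav2vec2-base-sk-17k",
    ctc_loss_reduction="mean",
    pad_token_id=processor.tokenizer.pad_token_id,
    vocab_size=len(processor.tokenizer),
)
model.freeze_feature_encoder()  # common practice when fine-tuning wav2vec 2.0
# ... fine-tune with CTC loss on labeled Slovak speech ...
```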