jlehecka committed on
Commit e2f223b
1 Parent(s): c7522c4

Update README.md

Files changed (1)
  1. README.md +12 -7
README.md CHANGED
@@ -12,7 +12,7 @@ This is a monolingual Slovak Wav2Vec 2.0 base model pre-trained from 17 thousand
 
 This model does not have a tokenizer as it was pretrained on audio alone. In order to use this model for speech recognition, a tokenizer should be created, and the model should be fine-tuned on labeled data.
 
-The model was initialized from [fav-kky/wav2vec2-base-cs-80k-ClTRUS](https://huggingface.co/fav-kky/wav2vec2-base-cs-80k-ClTRUS), so transfer learning from Czech to Slovak was used to pre-train the model, see our paper for details.
+The model was initialized from the Czech pre-trained model [fav-kky/wav2vec2-base-cs-80k-ClTRUS](https://huggingface.co/fav-kky/wav2vec2-base-cs-80k-ClTRUS). We found this cross-language transfer learning approach better than pre-training from scratch. See our paper for details.
 
 ## Pretraining data
 Almost 18 thousand hours of unlabeled Slovak speech:
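The unchanged context in this hunk notes that the checkpoint ships without a tokenizer. A minimal sketch of what creating one and preparing the model for CTC fine-tuning could look like with the `transformers` library; the repo id `fav-kky/wav2vec2-base-sk-17k` and the `vocab.json` file are assumptions, not details stated in the README:

```python
from transformers import (
    Wav2Vec2CTCTokenizer,
    Wav2Vec2FeatureExtractor,
    Wav2Vec2ForCTC,
    Wav2Vec2Processor,
)

# The model was pretrained on audio alone, so the tokenizer is created here
# from a character-level vocabulary built from the labeled transcripts.
tokenizer = Wav2Vec2CTCTokenizer(
    "vocab.json",                    # hypothetical vocabulary file
    unk_token="[UNK]",
    pad_token="[PAD]",
    word_delimiter_token="|",
)
feature_extractor = Wav2Vec2FeatureExtractor(
    feature_size=1,
    sampling_rate=16_000,
    padding_value=0.0,
    do_normalize=True,
    return_attention_mask=True,
)
processor = Wav2Vec2Processor(feature_extractor=feature_extractor, tokenizer=tokenizer)

# Load the pretrained encoder with a freshly initialized CTC head sized to
# the new vocabulary; the head is then trained on labeled Slovak data.
model = Wav2Vec2ForCTC.from_pretrained(
    "fav-kky/wav2vec2-base-sk-17k",  # assumed repo id for this model
    ctc_loss_reduction="mean",
    pad_token_id=processor.tokenizer.pad_token_id,
    vocab_size=len(processor.tokenizer),
)
model.freeze_feature_encoder()       # feature encoder is commonly frozen for ASR fine-tuning
```

From here, fine-tuning proceeds with the usual wav2vec 2.0 ASR recipe (a padding data collator plus `Trainer`).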
@@ -51,24 +51,29 @@ After fine-tuning, the model scored the following results on public datasets:
 See our paper for details.
 
 ## Paper
-The preprint of our paper (accepted to TSD 2023) is available at TBD
+The preprint of our paper (accepted to TSD 2023) is available at https://arxiv.org/abs/2306.04399.
 
 ## Citation
 If you find this model useful, please cite our paper:
 ```
-@inproceedings{wav2vec2-base-cs-80k-ClTRUS,
+@inproceedings{wav2vec2-base-sk-17k,
   title = {{Transfer Learning of Transformer-based Speech Recognition Models from Czech to Slovak}},
   author = {
     Jan Lehe\v{c}ka and
     Josef V. Psutka and
     Josef Psutka
   },
-  booktitle = {{TSD} 2023},
-  publisher = {{Springer}},
-  year = {2022},
+  booktitle = {{Text, Speech, and Dialogue}},
+  publisher = {{Springer International Publishing}},
+  year = {2023},
   note = {(in press)},
+  url = {https://arxiv.org/abs/2306.04399},
 }
 ```
 
-## Related works
+## Related papers
+- [INTERSPEECH 2022 - Exploring Capabilities of Monolingual Audio Transformers using Large Datasets in Automatic Speech Recognition of Czech](https://www.isca-speech.org/archive/pdfs/interspeech_2022/lehecka22_interspeech.pdf)
+- INTERSPEECH 2023 - Transformer-based Speech Recognition Models for Oral History Archives in English, German, and Czech
+
+## Related models
 - [fav-kky/wav2vec2-base-cs-80k-ClTRUS](https://huggingface.co/fav-kky/wav2vec2-base-cs-80k-ClTRUS)
 
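The Czech model added under "Related models" is the checkpoint this Slovak model was initialized from. A minimal sketch of how such a cross-language initialization could be set up, assuming the self-supervised pre-training loop itself is handled separately (the authors' exact pipeline is described in the paper, not here):

```python
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2ForPreTraining

# Start from the Czech checkpoint instead of random weights; the loaded
# encoder then serves as the initialization for Slovak self-supervised
# pre-training on the unlabeled audio described under "Pretraining data".
model = Wav2Vec2ForPreTraining.from_pretrained("fav-kky/wav2vec2-base-cs-80k-ClTRUS")
feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained("fav-kky/wav2vec2-base-cs-80k-ClTRUS")

# ... continue the wav2vec 2.0 contrastive pre-training objective on Slovak speech ...
```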