dpykhtar committed on
Commit
c8a3553
1 Parent(s): 9bc9708

Update README.md

Files changed (1)
  1. README.md +4 -3
README.md CHANGED
@@ -31,7 +31,7 @@ img {
 | [![Riva Compatible](https://img.shields.io/badge/NVIDIA%20Riva-compatible-brightgreen#model-badge)](#deployment-with-nvidia-riva) |
 
 This model transcribes speech in the lowercase Ukrainian alphabet, including spaces and apostrophes, and is trained on 69 hours of Ukrainian speech data.
-It is a non-autoregressive "large" variant of Streaming Citrinet, with around 141 million parameters.
+It is a non-autoregressive "large" variant of Streaming Citrinet, with around 141 million parameters. The model was fine-tuned from a pre-trained Russian Citrinet-1024 model on Ukrainian speech data using the cross-language transfer learning [4] approach.
 See the [model architecture](#model-architecture) section and [NeMo documentation](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html#conformer-ctc) for complete architecture details.
 It is also compatible with NVIDIA Riva for [production-grade server deployments](#deployment-with-nvidia-riva).
@@ -88,7 +88,7 @@ The tokenizer for this model was built using the text transcripts of the train
 
 ### Datasets
 
-The model is trained on the Mozilla Common Voice Corpus 10.0 dataset, comprising 69 hours of Ukrainian speech.
+The model is trained on the validated Mozilla Common Voice Corpus 10.0 dataset (excluding the dev and test sets), comprising 69 hours of Ukrainian speech.
 
 ## Limitations
 
@@ -107,4 +107,5 @@ Check out [Riva live demo](https://developer.nvidia.com/riva#demos).
 
 [1] [Citrinet: Closing the Gap between Non-Autoregressive and Autoregressive End-to-End Models for Automatic Speech Recognition](https://arxiv.org/abs/2104.01721) <br />
 [2] [Google Sentencepiece Tokenizer](https://github.com/google/sentencepiece) <br />
-[3] [NVIDIA NeMo Toolkit](https://github.com/NVIDIA/NeMo)
+[3] [NVIDIA NeMo Toolkit](https://github.com/NVIDIA/NeMo) <br />
+[4] [Cross-Language Transfer Learning](https://scholar.google.com/citations?view_op=view_citation&hl=en&user=qmmIGnwAAAAJ&sortby=pubdate&citation_for_view=qmmIGnwAAAAJ:PVjk1bu6vJQC)
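
The README above states that the model's output vocabulary is the lowercase Ukrainian alphabet plus space and apostrophe. As a minimal sketch of what that constraint implies in practice, here is a hypothetical helper (not part of this repo or of NeMo) that normalizes reference text to the same vocabulary, e.g. before comparing against the model's transcripts:

```python
# Hypothetical helper: map arbitrary Ukrainian text onto the model's stated
# output vocabulary (lowercase Ukrainian letters, space, apostrophe).
# The letter set and apostrophe handling are assumptions, not taken from the repo.
UKRAINIAN_LETTERS = set("абвгґдеєжзиіїйклмнопрстуфхцчшщьюя")
ALLOWED = UKRAINIAN_LETTERS | {" ", "'"}

def normalize(text: str) -> str:
    """Lowercase, map typographic apostrophes to ASCII, drop other symbols."""
    text = text.lower().replace("\u2019", "'").replace("\u02bc", "'")
    kept = "".join(ch if ch in ALLOWED else " " for ch in text)
    return " ".join(kept.split())  # collapse runs of whitespace

print(normalize("«Сім'я» це головне!"))  # сім'я це головне
```

Applying such a normalization to references keeps punctuation and casing differences from inflating an error-rate comparison against the model's lowercase, punctuation-free output.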