eharper committed on
Commit
a7e0aba
1 Parent(s): 43d18c7

Update README.md

  1. README.md +12 -12
README.md CHANGED
@@ -147,18 +147,6 @@ model-index:
147
  This model transcribes speech in lower case English alphabet along with spaces and apostrophes.
148
  It is an "extra-large" version of the Conformer-Transducer model (around 600M parameters).
149
 
150
- ## NVIDIA Riva: Deployment
151
-
152
- If you like this and other models from NVIDIA (e.g., CTC-based Conformers), check out [NVIDIA Riva](https://developer.nvidia.com/riva), an accelerated speech AI SDK deployable on-prem, in all clouds, multi-cloud, hybrid, on edge, and embedded. This model, as well as other RNNT-based models, is currently not supported by Riva. You can find the list of models supported by Riva [here](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/models/index.html).
153
-
154
- Additionally, Riva provides:
155
-
156
- * World-class out-of-the-box accuracy for the most common languages with model checkpoints trained on proprietary data with hundreds of thousands of GPU-compute hours
157
- * Best-in-class accuracy via customization with run-time word boosting (e.g., brand and product names), acoustic model training, language model training, and inverse text normalization customizations
158
- * Streaming speech recognition, Kubernetes-compatible scaling, and enterprise-grade support
159
-
160
- Check out [Riva live demo](https://developer.nvidia.com/riva#demos).
161
-
162
  ## NVIDIA NeMo: Training
163
 
164
  To train, fine-tune, or play with the model, you will need to install [NVIDIA NeMo](https://github.com/NVIDIA/NeMo). We recommend installing it after you have installed the latest PyTorch version.
@@ -203,6 +191,18 @@ This model accepts 16000 Hz Mono-channel Audio (wav files) as input.
203
 
204
  This model provides transcribed speech as a string for a given audio sample.
205
 
206
  ## Model Architecture
207
 
208
  The Conformer-Transducer model is an autoregressive variant of the Conformer model [1] for Automatic Speech Recognition that uses Transducer loss/decoding instead of CTC loss. You can find more details on this model here: [Conformer-CTC Model](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html).
 
147
  This model transcribes speech in lower case English alphabet along with spaces and apostrophes.
148
  It is an "extra-large" version of the Conformer-Transducer model (around 600M parameters).
149
 
150
  ## NVIDIA NeMo: Training
151
 
152
  To train, fine-tune, or play with the model, you will need to install [NVIDIA NeMo](https://github.com/NVIDIA/NeMo). We recommend installing it after you have installed the latest PyTorch version.
 
191
 
192
  This model provides transcribed speech as a string for a given audio sample.
193
 
194
+ ## NVIDIA Riva: Deployment
195
+
196
+ If you like this and other models from NVIDIA (e.g., CTC-based Conformers), check out [NVIDIA Riva](https://developer.nvidia.com/riva), an accelerated speech AI SDK deployable on-prem, in all clouds, multi-cloud, hybrid, on edge, and embedded. This model, as well as other RNNT-based models, is currently not supported by Riva. You can find the list of models supported by Riva [here](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/reference/models/index.html).
197
+
198
+ Additionally, Riva provides:
199
+
200
+ * World-class out-of-the-box accuracy for the most common languages with model checkpoints trained on proprietary data with hundreds of thousands of GPU-compute hours
201
+ * Best-in-class accuracy via customization with run-time word boosting (e.g., brand and product names), acoustic model training, language model training, and inverse text normalization customizations
202
+ * Streaming speech recognition, Kubernetes-compatible scaling, and enterprise-grade support
203
+
204
+ Check out [Riva live demo](https://developer.nvidia.com/riva#demos).
205
+
206
  ## Model Architecture
207
 
208
  The Conformer-Transducer model is an autoregressive variant of the Conformer model [1] for Automatic Speech Recognition that uses Transducer loss/decoding instead of CTC loss. You can find more details on this model here: [Conformer-CTC Model](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html).
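
The README touched by this commit says the model takes mono-channel WAV audio sampled at 16 kHz. As a rough illustration, the sketch below checks a file against that input format before transcription, using only Python's standard `wave` module. The names `is_supported_wav` and `write_silence` are hypothetical helpers made up for this example; they are not part of NeMo or of the model card.

```python
import os
import tempfile
import wave

EXPECTED_RATE_HZ = 16000  # sample rate the README describes (16 kHz)
EXPECTED_CHANNELS = 1     # mono-channel audio


def is_supported_wav(path: str) -> bool:
    """Return True if the WAV file at `path` is 16 kHz mono."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getnchannels() == EXPECTED_CHANNELS
        )


def write_silence(path: str, rate_hz: int, channels: int, seconds: float = 0.1) -> None:
    """Write a short silent 16-bit PCM WAV file (hypothetical demo helper)."""
    n_frames = int(rate_hz * seconds)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00\x00" * n_frames * channels)


if __name__ == "__main__":
    tmp_dir = tempfile.mkdtemp()
    mono_16k = os.path.join(tmp_dir, "mono_16k.wav")
    stereo_44k = os.path.join(tmp_dir, "stereo_44k.wav")
    write_silence(mono_16k, 16000, 1)
    write_silence(stereo_44k, 44100, 2)
    print(is_supported_wav(mono_16k))
    print(is_supported_wav(stereo_44k))
```

A file that fails this check would need to be resampled or downmixed first. This sketch does not cover that step.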