Automatic Speech Recognition
NeMo
PyTorch
English
speech
audio
Transducer
TDT
FastConformer
Conformer
NeMo
hf-asr-leaderboard
Eval Results
nithinraok commited on
Commit
3b91bdf
1 Parent(s): 7239fb0

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -177,7 +177,7 @@ img {
177
 
178
 
179
  `parakeet-hyb-pnc-1.1b` is an ASR model that transcribes speech with Punctuations and Capitalizations of English alphabet. This model is jointly developed by [NVIDIA NeMo](https://github.com/NVIDIA/NeMo) and [Suno.ai](https://www.suno.ai/) teams.
180
- It is an XXL version of Hybrid FastConformer [1] TDT-CTC [2] (around 1.1B parameters) model.
181
  See the [model architecture](#model-architecture) section and [NeMo documentation](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html#fast-conformer) for complete architecture details.
182
 
183
  ## NVIDIA NeMo: Training
 
177
 
178
 
179
  `parakeet-hyb-pnc-1.1b` is an ASR model that transcribes speech with Punctuations and Capitalizations of English alphabet. This model is jointly developed by [NVIDIA NeMo](https://github.com/NVIDIA/NeMo) and [Suno.ai](https://www.suno.ai/) teams.
180
+ It is an XXL version of Hybrid FastConformer [1] TDT-CTC [2] (around 1.1B parameters) model. This model has been trained with Local Attention and Global token hence this model can transcribe **11 hrs** of audio in one single pass. And for reference this model can transcibe 90mins of audio in <16 sec on A100.
181
  See the [model architecture](#model-architecture) section and [NeMo documentation](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html#fast-conformer) for complete architecture details.
182
 
183
  ## NVIDIA NeMo: Training