nvidia/parakeet-tdt_ctc-1.1b

Automatic Speech Recognition

hf-asr-leaderboard


nithinraok committed on May 7

Commit 3b91bdf (1 parent: 7239fb0)

Update README.md

Files changed (1)

README.md +1 -1


@@ -177,7 +177,7 @@ img {
 `parakeet-hyb-pnc-1.1b` is an ASR model that transcribes speech with Punctuations and Capitalizations of English alphabet. This model is jointly developed by [NVIDIA NeMo](https://github.com/NVIDIA/NeMo) and [Suno.ai](https://www.suno.ai/) teams.
-It is an XXL version of Hybrid FastConformer [1] TDT-CTC [2] (around 1.1B parameters) model.
 See the [model architecture](#model-architecture) section and [NeMo documentation](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html#fast-conformer) for complete architecture details.
 ## NVIDIA NeMo: Training

 `parakeet-hyb-pnc-1.1b` is an ASR model that transcribes speech with Punctuations and Capitalizations of English alphabet. This model is jointly developed by [NVIDIA NeMo](https://github.com/NVIDIA/NeMo) and [Suno.ai](https://www.suno.ai/) teams.
+It is an XXL version of Hybrid FastConformer [1] TDT-CTC [2] (around 1.1B parameters) model. This model was trained with local attention and a global token, so it can transcribe **11 hrs** of audio in a single pass. For reference, it can transcribe 90 minutes of audio in <16 sec on an A100.
 See the [model architecture](#model-architecture) section and [NeMo documentation](https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/models.html#fast-conformer) for complete architecture details.
 ## NVIDIA NeMo: Training
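
The throughput figure in the added line can be restated as an inverse real-time factor (seconds of audio processed per second of wall-clock time). The sketch below only does that arithmetic from the two numbers in the commit (90 minutes of audio, under 16 seconds on an A100). The function and variable names are illustrative and not part of NeMo, and because the commit gives "<16 sec" as an upper bound, the result is a lower bound rather than a measured value.

```python
def inverse_real_time_factor(audio_seconds: float, wall_seconds: float) -> float:
    """Return seconds of audio transcribed per second of wall-clock time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return audio_seconds / wall_seconds


# Figures quoted in the README change: 90 minutes of audio in under 16 seconds.
audio_seconds = 90 * 60          # 5400 s of audio
wall_seconds_upper_bound = 16.0  # "<16 sec", so this is an upper bound on time

# Dividing by an upper bound on the time yields a lower bound on the speed-up.
rtfx_lower_bound = inverse_real_time_factor(audio_seconds, wall_seconds_upper_bound)

print(f"At least {rtfx_lower_bound:.1f}x faster than real time")
```

Under these figures the model processes audio at more than 337.5 times real-time speed on that hardware; the commit does not say how this scales to the 11-hour single-pass case.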