This model is fine-tuned on LibriSpeech 960h from a pretrained HuBERT-Large model (https://arxiv.org/abs/2106.07447) released by fairseq.

The model is trained with the pruned RNN-T loss (https://arxiv.org/abs/2206.13236) and achieves WERs of **1.93/3.93** on test-clean/test-other.
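For reference, WER here is the standard word error rate: the word-level edit distance (substitutions, insertions, deletions) divided by the number of reference words. A minimal sketch of the metric (the `wer` helper below is illustrative, not part of this repo, which scores with its own tooling):

```python
def wer(ref: str, hyp: str) -> float:
    """Word error rate: word-level edit distance / reference length."""
    r, h = ref.split(), hyp.split()
    # d[i][j] = edit distance between r[:i] and h[:j]
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i  # deletions only
    for j in range(len(h) + 1):
        d[0][j] = j  # insertions only
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1  # substitution cost
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # match / substitution
    return d[len(r)][len(h)] / len(r)

# one substitution + one deletion over six reference words -> 2/6
print(round(wer("the cat sat on the mat", "the cat sit on mat"), 3))  # → 0.333
```

A reported WER of 1.93 on test-clean means roughly 1.93 word errors per 100 reference words over the whole test set.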