This model is fine-tuned on LibriSpeech 960h, starting from a pretrained HuBERT-Large checkpoint (https://arxiv.org/abs/2106.07447) published by fairseq.
The model is trained with the pruned RNN-T loss (https://arxiv.org/abs/2206.13236), and achieves WERs of 1.93/3.93 on test-clean/test-other.
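For reference, the WER (word error rate) figures above are the word-level edit distance between hypothesis and reference transcripts, divided by the number of reference words. A minimal self-contained sketch (this is a generic illustration, not the project's scoring script):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # deleting all i reference words
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # inserting all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = dp[i - 1][j] + 1
            insertion = dp[i][j - 1] + 1
            dp[i][j] = min(substitution, deletion, insertion)
    return dp[len(ref)][len(hyp)] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words -> WER ~= 0.167
    print(wer("the cat sat on the mat", "the cat sat on mat"))
```

In practice, toolkits aggregate edit counts over the whole test set before dividing, rather than averaging per-utterance WERs.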