Introduction

This is a large streaming Zipformer model developed by the Next-gen Kaldi team at Xiaomi AI Lab. The model was trained on around 200,000 hours of open-source Chinese and English data and has around 150M parameters.

Performance on several popular test sets is listed below, reported as CER (%) for Chinese and WER (%) for English.

All results use chunk-size=16 and left-context-frames=128.
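To give a rough sense of what these streaming settings mean in time, the sketch below converts the two frame counts to milliseconds. It assumes the frame counts are measured at a 50 Hz frame rate (20 ms per frame); that rate is an assumption and is not stated in this card, so adjust `ASSUMED_FRAME_RATE_HZ` if your setup differs. The helper `frames_to_ms` is a hypothetical name used only for illustration.

```python
# Settings taken from this model card.
CHUNK_SIZE = 16
LEFT_CONTEXT_FRAMES = 128

# ASSUMPTION: frames are counted at 50 Hz (20 ms per frame).
# This value is not given in the model card.
ASSUMED_FRAME_RATE_HZ = 50.0


def frames_to_ms(num_frames: int, frame_rate_hz: float) -> float:
    """Convert a frame count to a duration in milliseconds."""
    return num_frames * 1000.0 / frame_rate_hz


chunk_ms = frames_to_ms(CHUNK_SIZE, ASSUMED_FRAME_RATE_HZ)
left_context_ms = frames_to_ms(LEFT_CONTEXT_FRAMES, ASSUMED_FRAME_RATE_HZ)
# Number of whole past chunks covered by the left context.
left_context_chunks = LEFT_CONTEXT_FRAMES // CHUNK_SIZE

print(f"chunk duration:        {chunk_ms:.0f} ms")
print(f"left context duration: {left_context_ms:.0f} ms")
print(f"left context in chunks: {left_context_chunks}")
```

Under the 50 Hz assumption, each chunk covers 320 ms of audio and the model can look back over 8 previous chunks.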

Head	AISHELL-1 / AISHELL-2 test	WenetSpeech test-net / test-meeting	Common Voice zh	KeSpeech test	LibriSpeech test-clean / test-other	GigaSpeech test	Common Voice en	TED-LIUM test
CTC	3.78 / 4.71	8.65 / 10.54	11.8	15.35	3.74 / 8.5	12.32	19.7	10.92
Transducer	3.53 / 4.48	8.31 / 10.27	11.99	14.83	3.26 / 7.51	11.77	17.53	10.82
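For readers unfamiliar with the metrics in the table, the following is a minimal sketch of how CER and WER are commonly computed: the Levenshtein (edit) distance between reference and hypothesis, divided by the reference length, taken over characters for CER and over whitespace-separated words for WER. It is not the scoring script used to produce the numbers above, which may apply text normalization not shown here; the function names are illustrative.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance: minimum substitutions, insertions, deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance as a percentage of the reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def wer(ref: str, hyp: str) -> float:
    """Word error rate: tokens are whitespace-separated words."""
    return error_rate(ref.split(), hyp.split())


def cer(ref: str, hyp: str) -> float:
    """Character error rate: tokens are characters, spaces ignored."""
    return error_rate(list(ref.replace(" ", "")), list(hyp.replace(" ", "")))


# One substitution and one deletion against a 6-word reference.
print(f"WER: {wer('the cat sat on the mat', 'the cat sit on mat'):.2f}")
# One substituted character against a 6-character reference.
print(f"CER: {cer('今天天气很好', '今天天汽很好'):.2f}")
```

Corpus-level scores are usually obtained by summing the edit distances over all utterances and dividing by the total reference length, rather than averaging per-utterance rates.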

Please refer to the zipformer repository on GitHub for model details.

Training sets: LibriSpeech, GigaSpeech, Common Voice 2022 (zh + en), Libriheavy, Emilia (zh + en), AISHELL-2, WenetSpeech, WenetSpeech4TTS, KeSpeech, AISHELL-1, aidatatang, AISHELL-4, AliMeeting, MagicData, Primewords, ST-CMDS, THCHS-30.

Documentation

For usage instructions, please refer to https://pkufool.github.io/zipformer/en/models/

Citation

@inproceedings{yao2024zipformer,
  title={Zipformer: A faster and better encoder for automatic speech recognition},
  author={Yao, Zengwei and Guo, Liyong and Yang, Xiaoyu and Kang, Wei and Kuang, Fangjun and Yang, Yifan and Jin, Zengrui and Lin, Long and Povey, Daniel},
  booktitle={International Conference on Learning Representations},
  volume={2024},
  pages={44440--44455},
  year={2024}
}
@inproceedings{yao2025cr,
  title={CR-CTC: Consistency regularization on CTC for improved speech recognition},
  author={Yao, Zengwei and Kang, Wei and Yang, Xiaoyu and Kuang, Fangjun and Guo, Liyong and Zhu, Han and Jin, Zengrui and Li, Zhaoqing and Lin, Long and Povey, Daniel},
  booktitle={International Conference on Learning Representations},
  volume={2025},
  pages={26850--26868},
  year={2025}
}

