Introduction

This is a small streaming Zipformer model developed by the Xiaomi AI Lab Next-gen Kaldi team. The model was trained on around 200,000 hours of open-source Chinese and English data. It has around 25M parameters with the CTC head and around 35M with the transducer head.

Performance on some popular test sets (CER for Chinese, WER for English) is listed in the table below.

All results use chunk-size=16 and left-context-frames=128.
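
To give a feel for what these streaming settings mean in time, here is a minimal sketch that converts the two frame counts into durations. It assumes that both values are counted in 50 Hz encoder frames (20 ms per frame); the model card does not state the frame rate, so treat `FRAME_RATE_HZ` and the helper `frames_to_ms` as illustrative assumptions rather than part of the released code.

```python
# Assumption (not stated in the model card): chunk-size and
# left-context-frames are counted in 50 Hz frames, i.e. 20 ms per frame.
FRAME_RATE_HZ = 50


def frames_to_ms(num_frames: int, frame_rate_hz: int = FRAME_RATE_HZ) -> float:
    """Convert a frame count to a duration in milliseconds."""
    return num_frames * 1000.0 / frame_rate_hz


chunk_size = 16            # from the model card
left_context_frames = 128  # from the model card

chunk_ms = frames_to_ms(chunk_size)
left_context_ms = frames_to_ms(left_context_frames)
# Number of whole past chunks that fit into the left context.
left_context_chunks = left_context_frames // chunk_size

print(f"chunk duration: {chunk_ms:.0f} ms")
print(f"left context:   {left_context_ms:.0f} ms ({left_context_chunks} chunks)")
```

Under that assumption, each chunk covers 320 ms of audio and the model can look back over 2.56 s of history. A different frame rate would scale both numbers proportionally.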

Head	aishell test 1 / 2	wenetspeech test-net / meeting	Common Voice zh	kespeech test	librispeech test-clean / other	gigaspeech test	Common Voice en	tedlium test
CTC	6.7 / 7.24	12.92 / 16.45	17.18	23.32	19.4 / 29.66	26.18	33.52	17.67
Transducer	5.69 / 6.26	12.06 / 16.13	16.51	22.29	8.15 / 16.91	19.77	28.54	14.23
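
The CER and WER figures in the table are edit-distance based error rates. The sketch below shows the usual definition: the Levenshtein distance between reference and hypothesis, divided by the reference length, computed over characters for CER and over whitespace-separated words for WER. It is a generic illustration, not the scoring script used for the table; the text normalization applied before scoring is not described in the model card, and the function names here are hypothetical. Results are returned as percentages, assuming the table is also in percent.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1]


def error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Error rate in percent, relative to the reference length."""
    if not ref:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def wer(ref: str, hyp: str) -> float:
    """Word error rate: tokens are whitespace-separated words."""
    return error_rate(ref.split(), hyp.split())


def cer(ref: str, hyp: str) -> float:
    """Character error rate: tokens are characters, spaces ignored."""
    return error_rate(list(ref.replace(" ", "")), list(hyp.replace(" ", "")))


print(wer("the cat sat", "the cat sit"))
print(cer("今天天气好", "今天天汽好"))
```

Real evaluations usually aggregate errors over the whole test set (total edits divided by total reference tokens) rather than averaging per-utterance rates.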

Please refer to the zipformer repository on GitHub for model details.

Training sets: LibriSpeech, GigaSpeech, Common Voice 2022 (zh + en), Libriheavy, Emilia (zh + en), AISHELL-2, WenetSpeech, WenetSpeech4TTS, KeSpeech, AISHELL, aidatatang, AISHELL-4, AliMeeting, MagicData, Primewords, ST-CMDS, THCHS-30.

Documentation

Please refer to https://pkufool.github.io/zipformer/en/models/

Citation

@inproceedings{yao2024zipformer,
  title={Zipformer: A faster and better encoder for automatic speech recognition},
  author={Yao, Zengwei and Guo, Liyong and Yang, Xiaoyu and Kang, Wei and Kuang, Fangjun and Yang, Yifan and Jin, Zengrui and Lin, Long and Povey, Daniel},
  booktitle={International Conference on Learning Representations},
  volume={2024},
  pages={44440--44455},
  year={2024}
}
@inproceedings{yao2025cr,
  title={Cr-ctc: Consistency regularization on ctc for improved speech recognition},
  author={Yao, Zengwei and Kang, Wei and Yang, Xiaoyu and Kuang, Fangjun and Guo, Liyong and Zhu, Han and Jin, Zengrui and Li, Zhaoqing and Lin, Long and Povey, Daniel},
  booktitle={International Conference on Learning Representations},
  volume={2025},
  pages={26850--26868},
  year={2025}
}

