Every Expert Matters: Towards Effective Knowledge Distillation for Mixture-of-Experts Language Models • Paper • 2502.12947 • Published 27 days ago
XiYanSQL Models • Collection • The XiYanSQL series are foundational SQL models available in various sizes, including 3B, 7B, 14B, and 32B. • 4 items • Updated 8 days ago