- Eurus — Collection: Advancing LLM Reasoning Generalists with Preference Trees (11 items, updated Apr 15)
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models — Paper, arXiv:2401.06066 (published Jan 11)
- Retentive Network: A Successor to Transformer for Large Language Models — Paper, arXiv:2307.08621 (published Jul 17, 2023)