35 DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models · 17 authors
20 Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation · 14 authors
18 Patchscope: A Unifying Framework for Inspecting Hidden Representations of Language Models · 5 authors
6 A Shocking Amount of the Web is Machine Translated: Insights from Multi-Way Parallelism · 5 authors
5 Tuning LLMs with Contrastive Alignment Instructions for Machine Translation in Unseen, Low-resource Languages · 2 authors