A Closer Look into Mixture-of-Experts in Large Language Models • arXiv:2406.18219 • Published Jun 26, 2024
VisionZip: Longer is Better but Not Necessary in Vision Language Models • arXiv:2412.04467 • Published Dec 5, 2024
p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay • arXiv:2412.04449 • Published Dec 5, 2024
ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing • arXiv:2412.14711 • Published Dec 2024