- Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM
  Paper • 2403.07816 • Published • 33
- Gemma: Open Models Based on Gemini Research and Technology
  Paper • 2403.08295 • Published • 41
- MoAI: Mixture of All Intelligence for Large Language and Vision Models
  Paper • 2403.07508 • Published • 65
- MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
  Paper • 2403.09611 • Published • 110
Shyam Sunder Kumar
theainerd
AI & ML interests
Natural Language Processing
Organizations
Collections: 1
Models: 2
Datasets: none public yet