BLINK: Multimodal Large Language Models Can See but Not Perceive • Paper • 2404.12390 • Published Apr 18, 2024 • 23
Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models • Paper • 2404.12387 • Published Apr 18, 2024 • 35
MoAI: Mixture of All Intelligence for Large Language and Vision Models • Paper • 2403.07508 • Published Mar 12, 2024 • 70
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context • Paper • 2403.05530 • Published Mar 8, 2024 • 50
Wukong: Towards a Scaling Law for Large-Scale Recommendation • Paper • 2403.02545 • Published Mar 4, 2024 • 15
Learning and Leveraging World Models in Visual Representation Learning • Paper • 2403.00504 • Published Mar 1, 2024 • 25
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits • Paper • 2402.17764 • Published Feb 27, 2024 • 566
PromptBench: A Unified Library for Evaluation of Large Language Models • Paper • 2312.07910 • Published Dec 13, 2023 • 14
How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation • Paper • 2312.07424 • Published Dec 12, 2023 • 7