LLaVA-o1: Let Vision Language Models Reason Step-by-Step • Paper • 2411.10440 • Published 9 days ago • 96
Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese • Paper • 2408.12480 • Published Aug 22 • 17
InternVL 2.0 • Collection • Expanding Performance Boundaries of Open-Source MLLM • 17 items • Updated 3 days ago • 79