Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models Paper • 2501.05767 • Published 17 days ago • 28
StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding Paper • 2411.03628 • Published Nov 6, 2024 • 2