Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models Paper • 2308.13437 • Published Aug 25, 2023 • 3
Browse and Concentrate: Comprehending Multimodal Content via prior-LLM Context Fusion Paper • 2402.12195 • Published Feb 19
CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models Paper • 2402.13607 • Published Feb 21
ActiView: Evaluating Active Perception Ability for Multimodal Large Language Models Paper • 2410.04659 • Published Oct 7
StreamingBench: Assessing the Gap for MLLMs to Achieve Streaming Video Understanding Paper • 2411.03628 • Published 18 days ago • 2
LLM×MapReduce: Simplified Long-Sequence Processing using Large Language Models Paper • 2410.09342 • Published Oct 12 • 37