Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding Paper • 2409.14485 • Published Sep 22
MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval Paper • 2412.14475 • Published 4 days ago • 47
Article Key Insights into the Law of Vision Representations in MLLMs By Borise • Sep 2 • 18