Learning Video Representations without Natural Videos • Paper • 2410.24213 • Published 26 days ago • 14
ADEM-VL: Adaptive and Embedded Fusion for Efficient Vision-Language Tuning • Paper • 2410.17779 • Published Oct 23 • 7