InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding Paper • 2403.15377 • Published Mar 22 • 22