InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation Paper • 2307.06942 • Published Jul 13, 2023 • 22