Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization Paper • 2402.03161 • Published Feb 5