17 Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion · 8 authors 1
16 InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions · 6 authors 1
13 Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization · 13 authors 2
12 Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities · 6 authors 4