Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment Paper • 2502.04328 • Published Feb 6 • 29
MatAnyone: Stable Video Matting with Consistent Memory Propagation Paper • 2501.14677 • Published Jan 24 • 31
Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos Paper • 2501.13826 • Published Jan 23 • 24
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding Paper • 2501.13106 • Published Jan 22 • 85
CityDreamer4D: Compositional Generative Model of Unbounded 4D Cities Paper • 2501.08983 • Published Jan 15 • 20
RepVideo: Rethinking Cross-Layer Representation for Video Generation Paper • 2501.08994 • Published Jan 15 • 15
Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives Paper • 2501.04003 • Published Jan 7 • 25
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control Paper • 2501.03847 • Published Jan 7 • 23
Evaluation Agent: Efficient and Promptable Evaluation Framework for Visual Generative Models Paper • 2412.09645 • Published Dec 10, 2024 • 35
VBench: Comprehensive Benchmark Suite for Video Generative Models Paper • 2311.17982 • Published Nov 29, 2023 • 8