Any2Caption:Interpreting Any Condition to Caption for Controllable Video Generation Paper • 2503.24379 • Published 23 days ago • 75
Wan: Open and Advanced Large-Scale Video Generative Models Paper • 2503.20314 • Published 28 days ago • 49
Position: Interactive Generative Video as Next-Generation Game Engine Paper • 2503.17359 • Published Mar 21 • 62
DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers Paper • 2503.14487 • Published Mar 18 • 27 • 5
DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers Paper • 2503.14487 • Published Mar 18 • 27 • 5
DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers Paper • 2503.14487 • Published Mar 18 • 27 • 5
DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers Paper • 2503.14487 • Published Mar 18 • 27 • 5
DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers Paper • 2503.14487 • Published Mar 18 • 27
DiffMoE: Dynamic Token Selection for Scalable Diffusion Transformers Paper • 2503.14487 • Published Mar 18 • 27
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video Paper • 2503.11647 • Published Mar 14 • 135
MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning Paper • 2503.07365 • Published Mar 10 • 60
ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models Paper • 2502.09696 • Published Feb 13 • 44
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 385
3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion Paper • 2409.12957 • Published Sep 19, 2024 • 22
Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution Paper • 2409.12961 • Published Sep 19, 2024 • 26