SkyReels-A2: Compose Anything in Video Diffusion Transformers Paper • 2504.02436 • Published 16 days ago • 35
Long-Context Autoregressive Video Modeling with Next-Frame Prediction Paper • 2503.19325 • Published 25 days ago • 71
XAttention: Block Sparse Attention with Antidiagonal Scoring Paper • 2503.16428 • Published 29 days ago • 14
MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance Paper • 2503.16421 • Published 29 days ago • 9
JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse Paper • 2503.16365 • Published 29 days ago • 38
ReCamMaster: Camera-Controlled Generative Rendering from A Single Video Paper • 2503.11647 • Published Mar 14 • 134
CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance Paper • 2503.10391 • Published Mar 13 • 10
Tuning-Free Multi-Event Long Video Generation via Synchronized Coupled Sampling Paper • 2503.08605 • Published Mar 11 • 26
Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers Paper • 2503.00865 • Published Mar 2 • 63
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles Paper • 2503.03651 • Published Mar 5 • 16
Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models Paper • 2503.01774 • Published Mar 3 • 43