Seedream 2.0: A Native Chinese-English Bilingual Image Generation Foundation Model • Paper • 2503.07703 • Published 25 days ago • 34
LMM-R1: Empowering 3B LMMs with Strong Reasoning Abilities Through Two-Stage Rule-Based RL • Paper • 2503.07536 • Published 25 days ago • 83
TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models • Paper • 2503.05638 • Published 28 days ago • 18
R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model • Paper • 2503.05132 • Published 29 days ago • 52
Unified Reward Model for Multimodal Understanding and Generation • Paper • 2503.05236 • Published 28 days ago • 112
Effective and Efficient Masked Image Generation Models • Paper • 2503.07197 • Published 25 days ago • 11
EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer • Paper • 2503.07027 • Published 25 days ago • 27
AnyMoLe: Any Character Motion In-betweening Leveraging Video Diffusion Models • Paper • 2503.08417 • Published 24 days ago • 8
Reangle-A-Video: 4D Video Generation as Video-to-Video Translation • Paper • 2503.09151 • Published 23 days ago • 30
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models • Paper • 2503.09573 • Published 23 days ago • 67
PoseLess: Depth-Free Vision-to-Joint Control via Direct Image Mapping with VLM • Paper • 2503.07111 • Published 25 days ago • 3
Piece it Together: Part-Based Concepting with IP-Priors • Paper • 2503.10365 • Published 22 days ago • 8
Autoregressive Image Generation with Randomized Parallel Decoding • Paper • 2503.10568 • Published 22 days ago • 8
Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k • Paper • 2503.09642 • Published 24 days ago • 17