In-2-4D: Inbetweening from Two Single-View Images to 4D Generation Paper • 2504.08366 • Published 6 days ago • 5
PixelFlow: Pixel-Space Generative Models with Flow Paper • 2504.07963 • Published 6 days ago • 14
Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images Paper • 2504.08727 • Published 5 days ago • 8
BlenderGym: Benchmarking Foundational Model Systems for Graphics Editing Paper • 2504.01786 • Published 14 days ago • 4
HoloPart: Generative 3D Part Amodal Segmentation Paper • 2504.07943 • Published 6 days ago • 25
VisualCloze: A Universal Image Generation Framework via Visual In-Context Learning Paper • 2504.07960 • Published 6 days ago • 39
RobustDexGrasp: Robust Dexterous Grasping of General Objects from Single-view Perception Paper • 2504.05287 • Published 9 days ago • 3
WildGS-SLAM: Monocular Gaussian Splatting SLAM in Dynamic Environments Paper • 2504.03886 • Published 12 days ago • 9
Masked Scene Modeling: Narrowing the Gap Between Supervised and Self-Supervised Learning in 3D Scene Understanding Paper • 2504.06719 • Published 8 days ago • 8
GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography Paper • 2504.07083 • Published 7 days ago • 21
OmniSVG: A Unified Scalable Vector Graphics Generation Model Paper • 2504.06263 • Published 8 days ago • 141
One-Minute Video Generation with Test-Time Training Paper • 2504.05298 • Published 9 days ago • 93
VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step Paper • 2504.01956 • Published 14 days ago • 38
AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction Paper • 2504.01014 • Published 15 days ago • 59
GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors Paper • 2504.01016 • Published 15 days ago • 28
Any2Caption: Interpreting Any Condition to Caption for Controllable Video Generation Paper • 2503.24379 • Published 16 days ago • 74
TextCrafter: Accurately Rendering Multiple Texts in Complex Visual Scenes Paper • 2503.23461 • Published 17 days ago • 93
UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation Paper • 2503.14941 • Published 29 days ago • 6
Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data Paper • 2503.21694 • Published 20 days ago • 16