MoRL: Reinforced Reasoning for Unified Motion Understanding and Generation Paper • 2602.14534 • Published 2 days ago • 2
Light4D: Training-Free Extreme Viewpoint 4D Video Relighting Paper • 2602.11769 • Published 6 days ago • 2
Code2Worlds: Empowering Coding LLMs for 4D World Generation Paper • 2602.11757 • Published 6 days ago • 3
GeneralVLA: Generalizable Vision-Language-Action Models with Knowledge-Guided Trajectory Planning Paper • 2602.04315 • Published 14 days ago • 1
VaseVQA: Multimodal Agent and Benchmark for Ancient Greek Pottery Paper • 2509.17191 • Published Sep 21, 2025 • 1
3D CoCa v2: Contrastive Learners with Test-Time Search for Generalizable Spatial Intelligence Paper • 2601.06496 • Published Jan 10 • 1
DriveGen3D: Boosting Feed-Forward Driving Scene Generation with Efficient Video Diffusion Paper • 2510.15264 • Published Oct 17, 2025 • 4
EgoLCD: Egocentric Video Generation with Long Context Diffusion Paper • 2512.04515 • Published Dec 4, 2025 • 6
BlockVid: Block Diffusion for High-Quality and Consistent Minute-Long Video Generation Paper • 2511.22973 • Published Nov 28, 2025 • 7
MobileVLA-R1: Reinforcing Vision-Language-Action for Mobile Robots Paper • 2511.17889 • Published Nov 22, 2025 • 5
Inferix: A Block-Diffusion based Next-Generation Inference Engine for World Simulation Paper • 2511.20714 • Published Nov 25, 2025 • 50
VLA-R1: Enhancing Reasoning in Vision-Language-Action Models Paper • 2510.01623 • Published Oct 2, 2025 • 12
VolSplat: Rethinking Feed-Forward 3D Gaussian Splatting with Voxel-Aligned Prediction Paper • 2509.19297 • Published Sep 23, 2025 • 25