Pre-training Auto-regressive Robotic Models with 4D Representations Paper • 2502.13142 • Published 24 days ago • 4
QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search Paper • 2502.02584 • Published Feb 4, 2025 • 17
Zero-Shot Novel View and Depth Synthesis with Multi-View Geometric Diffusion Paper • 2501.18804 • Published Jan 30, 2025 • 5
MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion Paper • 2410.03825 • Published Oct 4, 2024 • 19
CameraCtrl: Enabling Camera Control for Text-to-Video Generation Paper • 2404.02101 • Published Apr 2, 2024 • 23
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers Paper • 2402.19479 • Published Feb 29, 2024 • 34