OmniDirector: General Multi-Shot Camera Cloning without Cross-Paired Data Paper • 2606.13432 • Published 5 days ago • 93
FORT-Searcher: Synthesizing Shortcut-Resistant Search Tasks for Training Deep Search Agents Paper • 2606.12087 • Published 6 days ago • 72
Claw-SWE-Bench: A Benchmark for Evaluating OpenClaw-style Agent Harnesses on Coding Tasks Paper • 2606.12344 • Published 6 days ago • 65
From Correctness to Utility: Gain-Based Prefix Evaluation for LLM Reasoning Paper • 2606.07190 • Published 11 days ago • 34
Toward Generalist Autonomous Research via Hypothesis-Tree Refinement Paper • 2606.11926 • Published 6 days ago • 110
Retrospective Harness Optimization: Improving LLM Agents via Self-Preference over Trajectory Rollouts Paper • 2606.05922 • Published 12 days ago • 52
Multi-view Consistent 3D Gaussian Head Avatars 'without' Multi-view Generation Paper • 2605.25220 • Published 23 days ago • 7
RoboStressBench: Benchmarking VLM Robustness to Physical Visual Stress in Embodied Scenes Paper • 2606.00828 • Published 17 days ago • 10
On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters Paper • 2606.02437 • Published 15 days ago • 228
Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players Paper • 2605.28816 • Published 20 days ago • 423