Claw-SWE-Bench: A Benchmark for Evaluating OpenClaw-style Agent Harnesses on Coding Tasks Paper • 2606.12344 • Published 1 day ago • 31
Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models Paper • 2606.03988 • Published 8 days ago • 114
AlphaTransit: Learning to Design City-scale Transit Routes Paper • 2605.28730 • Published 15 days ago • 7
DarkForest: Less Talk, Higher Accuracy for Multi-Agent LLMs Paper • 2605.25188 • Published 18 days ago • 15
PANDO: Efficient Multimodal AI Agents via Online Skill Distillation Paper • 2605.24785 • Published 16 days ago • 11
Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players Paper • 2605.28816 • Published 15 days ago • 423