Look Light, Think Heavy: What Multimodal Chain-of-Thought Reasoning Can and Cannot Do Paper • 2606.22565 • Published 9 days ago • 9
NatureBench: Can Coding Agents Match the Published SOTA of Nature-Family Papers? Paper • 2606.24530 • Published 7 days ago • 61
Are Text-to-Image Models Inductivist Turkeys? A Counterfactual Benchmark for Causal Reasoning Paper • 2606.24548 • Published 7 days ago • 8
Semantic Browsing: Controllable Diversity for Image Generation Paper • 2606.23679 • Published 8 days ago • 20
FedOT: Ownership Verification and Leakage Tracing via Watermarks for Federated LDMs Paper • 2606.22875 • Published 8 days ago • 11
Grouped Query Experts: Mixture-of-Experts on GQA Self-Attention Paper • 2606.20945 • Published 12 days ago • 75
GateMem: Benchmarking Memory Governance in Multi-Principal Shared-Memory Agents Paper • 2606.18829 • Published 13 days ago • 18
ImageWAM: Do World Action Models Really Need Video Generation, or Just Image Editing? Paper • 2606.19531 • Published 13 days ago • 21
Reinforcing Dual-Path Reasoning in Spatial Vision Language Models Paper • 2606.17539 • Published 14 days ago • 15
Trust the Right Teacher: Quality-Aware Self-Distillation for GUI Grounding Paper • 2606.18101 • Published 14 days ago • 15
Learning from the Self-future: On-policy Self-distillation for dLLMs Paper • 2606.18195 • Published 14 days ago • 76
LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling Paper • 2606.18023 • Published 14 days ago • 208
TRIAGE: Dialectical Reasoning for Explainable Risk Prediction on Irregularly Sampled Medical Time Series with LLMs Paper • 2606.09030 • Published 22 days ago • 30
VisualClaw: A Real-Time, Personalized Agent for the Physical World Paper • 2606.16295 • Published 15 days ago • 28
Smaller Models are Natural Explorers for Policy-Level Diversity in GRPO Paper • 2605.30789 • Published 28 days ago • 26
VIA-SD: Verification via Intra-Model Routing for Speculative Decoding Paper • 2606.12243 • Published 20 days ago • 35
Claw-SWE-Bench: A Benchmark for Evaluating OpenClaw-style Agent Harnesses on Coding Tasks Paper • 2606.12344 • Published 20 days ago • 70