Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead • Paper • 2504.00294 • Published 22 days ago • 10
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders • Paper • 2503.18878 • Published 30 days ago • 117
Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking • Paper • 2503.19855 • Published 29 days ago • 26
UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning • Paper • 2503.21620 • Published 27 days ago • 59
Where do Large Vision-Language Models Look at when Answering Questions? • Paper • 2503.13891 • Published Mar 18 • 8
On the Acquisition of Shared Grammatical Representations in Bilingual Language Models • Paper • 2503.03962 • Published Mar 5 • 3
LINGOLY-TOO: Disentangling Memorisation from Reasoning with Linguistic Templatisation and Orthographic Obfuscation • Paper • 2503.02972 • Published Mar 4 • 25
AppAgentX: Evolving GUI Agents as Proficient Smartphone Users • Paper • 2503.02268 • Published Mar 4 • 11
Predictive Data Selection: The Data That Predicts Is the Data That Teaches • Paper • 2503.00808 • Published Mar 2 • 57