Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers Paper • 2503.00865 • Published Mar 2 • 64
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models Paper • 2503.16419 • Published Mar 20 • 71
Large Language Model Agent: A Survey on Methodology, Applications and Challenges Paper • 2503.21460 • Published 28 days ago • 77
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders Paper • 2503.18878 • Published Mar 24 • 118
S1-Bench: A Simple Benchmark for Evaluating System 1 Thinking Capability of Large Reasoning Models Paper • 2504.10368 • Published 10 days ago • 21
VLM-R1: A Stable and Generalizable R1-style Large Vision-Language Model Paper • 2504.07615 • Published 15 days ago • 30
Scaling Laws for Native Multimodal Models Paper • 2504.07951 • Published 14 days ago • 27
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories Paper • 2504.08942 • Published 13 days ago • 27
Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models Paper • 2503.22165 • Published 28 days ago • 27
Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models Paper • 2504.04823 • Published 18 days ago • 30
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs Paper • 2504.11536 • Published 9 days ago • 58
PaperBench: Evaluating AI's Ability to Replicate AI Research Paper • 2504.01848 • Published 22 days ago • 36
ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness Paper • 2504.10514 • Published 14 days ago • 45
VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning Paper • 2504.08837 • Published 14 days ago • 42
Understanding R1-Zero-Like Training: A Critical Perspective Paper • 2503.20783 • Published 29 days ago • 45