Surrogate Signals from Format and Length: Reinforcement Learning for Solving Mathematical Problems without Ground Truth Answers Paper • 2505.19439 • Published 22 days ago • 30
Surrogate Signals from Format and Length: Reinforcement Learning for Solving Mathematical Problems without Ground Truth Answers Paper • 2505.19439 • Published 22 days ago • 30 • 2
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 280 • 43
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 280
O1 Replication Journey -- Part 3: Inference-time Scaling for Medical Reasoning Paper • 2501.06458 • Published Jan 11 • 32
Search-o1: Agentic Search-Enhanced Large Reasoning Models Paper • 2501.05366 • Published Jan 9 • 100
Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs Paper • 2412.21187 • Published Dec 30, 2024 • 42
OmniEval Collection An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain • 7 items • Updated Jan 2 • 2
OmniEval Collection An Omnidirectional and Automatic RAG Evaluation Benchmark in Financial Domain • 7 items • Updated Jan 2 • 2
SCOPE: Optimizing Key-Value Cache Compression in Long-context Generation Paper • 2412.13649 • Published Dec 18, 2024 • 20
Progressive Multimodal Reasoning via Active Retrieval Paper • 2412.14835 • Published Dec 19, 2024 • 74