-
Orca 2: Teaching Small Language Models How to Reason
Paper • 2311.11045 • Published • 73 -
Learning From Mistakes Makes LLM Better Reasoner
Paper • 2310.20689 • Published • 29 -
Let's Verify Step by Step
Paper • 2305.20050 • Published • 10 -
SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning
Paper • 2308.00436 • Published • 22
Collections
Discover the best community collections!
Collections including paper arxiv:2501.04682
-
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
Paper • 2502.02737 • Published • 203 -
Demystifying Long Chain-of-Thought Reasoning in LLMs
Paper • 2502.03373 • Published • 57 -
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Paper • 2501.12599 • Published • 104 -
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Paper • 2501.17161 • Published • 108
-
Internal Consistency and Self-Feedback in Large Language Models: A Survey
Paper • 2407.14507 • Published • 46 -
Large Language Models are Zero-Shot Reasoners
Paper • 2205.11916 • Published • 1 -
Let's Verify Step by Step
Paper • 2305.20050 • Published • 10 -
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Paper • 2201.11903 • Published • 11
-
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
Paper • 2501.12948 • Published • 346 -
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
Paper • 2502.05171 • Published • 124 -
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though
Paper • 2501.04682 • Published • 91 -
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Paper • 2501.04519 • Published • 263
-
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Paper • 2501.04519 • Published • 263 -
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though
Paper • 2501.04682 • Published • 91 -
Search-o1: Agentic Search-Enhanced Large Reasoning Models
Paper • 2501.05366 • Published • 95 -
Agent Laboratory: Using LLM Agents as Research Assistants
Paper • 2501.04227 • Published • 86
-
Evolving Deeper LLM Thinking
Paper • 2501.09891 • Published • 106 -
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Paper • 2501.04519 • Published • 263 -
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though
Paper • 2501.04682 • Published • 91
-
Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback
Paper • 2501.03916 • Published • 15 -
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though
Paper • 2501.04682 • Published • 91 -
Agent Laboratory: Using LLM Agents as Research Assistants
Paper • 2501.04227 • Published • 86 -
Search-o1: Agentic Search-Enhanced Large Reasoning Models
Paper • 2501.05366 • Published • 95
-
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Paper • 2501.04519 • Published • 263 -
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though
Paper • 2501.04682 • Published • 91 -
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Paper • 2408.03314 • Published • 57 -
Training Large Language Models to Reason in a Continuous Latent Space
Paper • 2412.06769 • Published • 78