CriticBench: Benchmarking LLMs for Critique-Correct Reasoning Paper • 2402.14809 • Published Feb 22, 2024 • 3
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search Paper • 2408.08152 • Published Aug 15, 2024 • 52
DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence Paper • 2406.11931 • Published Jun 17, 2024 • 57
CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing Paper • 2305.11738 • Published May 19, 2023 • 7
Synthetic Prompting: Generating Chain-of-Thought Demonstrations for Large Language Models Paper • 2302.00618 • Published Feb 1, 2023
Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy Paper • 2305.15294 • Published May 24, 2023 • 1
ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving Paper • 2309.17452 • Published Sep 29, 2023 • 3
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations Paper • 2312.08935 • Published Dec 14, 2023 • 4
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism Paper • 2401.02954 • Published Jan 5, 2024 • 41
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 72