LLM Reasoning - a ashioyajotham Collection

ashioyajotham 's Collections

safety

Scale

VLMs

LLM Reasoning

updated 19 days ago

Teaching Large Language Models to Reason with Reinforcement Learning

Paper • 2403.04642 • Published Mar 7, 2024 • 46
How Far Are We from Intelligent Visual Deductive Reasoning?

Paper • 2403.04732 • Published Mar 7, 2024 • 21
Common 7B Language Models Already Possess Strong Math Capabilities

Paper • 2403.04706 • Published Mar 7, 2024 • 18
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

Paper • 2405.14333 • Published May 23, 2024 • 38
Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published 20 days ago • 26
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published 19 days ago • 105