Teaching Large Language Models to Reason with Reinforcement Learning Paper • 2403.04642 • Published Mar 7
PERL: Parameter Efficient Reinforcement Learning from Human Feedback Paper • 2403.10704 • Published Mar 15
Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models Paper • 2404.02575 • Published Apr 3
MathScale: Scaling Instruction Tuning for Mathematical Reasoning Paper • 2403.02884 • Published Mar 5
ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models Paper • 2405.09220 • Published May 15
LLaMA-NAS: Efficient Neural Architecture Search for Large Language Models Paper • 2405.18377 • Published May 28