Chair-D's Collections
Reasoning
Chain-of-Thought Reasoning Without Prompting
Paper
•
2402.10200
•
Published
•
104
Teaching Large Language Models to Reason with Reinforcement Learning
Paper
•
2403.04642
•
Published
•
46
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
Paper
•
2403.10704
•
Published
•
57
MathScale: Scaling Instruction Tuning for Mathematical Reasoning
Paper
•
2403.02884
•
Published
•
15
Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models
Paper
•
2404.02575
•
Published
•
48
Advancing LLM Reasoning Generalists with Preference Trees
Paper
•
2404.02078
•
Published
•
44
Iterative Reasoning Preference Optimization
Paper
•
2404.19733
•
Published
•
47
ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models
Paper
•
2405.09220
•
Published
•
24
LLaMA-NAS: Efficient Neural Architecture Search for Large Language Models
Paper
•
2405.18377
•
Published
•
18
Towards Building Specialized Generalist AI with System 1 and System 2 Fusion
Paper
•
2407.08642
•
Published
•
9