DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 110
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28, 2025 • 116
The Ultra-Scale Playbook 🌌 • The ultimate guide to training LLMs on large GPU clusters • 2.35k