Big-Math Collection This collection contains assets associated with the Big-Math dataset, a high-quality collection of over 250,000 math questions with verifiable answers • 3 items • Updated 3 days ago • 3
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models Paper • 2502.17387 • Published 13 days ago • 5
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs Paper • 2503.01307 • Published 7 days ago • 30
FSPO: Few-Shot Preference Optimization of Synthetic Preference Data in LLMs Elicits Effective Personalization to Real Users Paper • 2502.19312 • Published 11 days ago • 5
Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Though Paper • 2501.04682 • Published Jan 8 • 91
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published Nov 7, 2024 • 115
Direct Preference Optimization Using Sparse Feature-Level Constraints Paper • 2411.07618 • Published Nov 12, 2024 • 16
PERSONA Collection Collection of various datasets related to the PERSONA paper. • 5 items • Updated Oct 28, 2024 • 2
NuminaMath Collection Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 7 items • Updated 27 days ago • 76
PERSONA: A Reproducible Testbed for Pluralistic Alignment Paper • 2407.17387 • Published Jul 24, 2024 • 20
Suppressing Pink Elephants with Direct Principle Feedback Paper • 2402.07896 • Published Feb 12, 2024 • 11