Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published Jun 20, 2024 • 87
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25, 2024 • 90
KTO: Model Alignment as Prospect Theoretic Optimization Paper • 2402.01306 • Published Feb 2, 2024 • 16
NuminaMath Collection Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 6 items • Updated Jul 21, 2024 • 70
Large Language Models Can Self-Improve in Long-context Reasoning Paper • 2411.08147 • Published Nov 12, 2024 • 63
O1 Replication Journey -- Part 2: Surpassing O1-preview through Simple Distillation, Big Progress or Bitter Lesson? Paper • 2411.16489 • Published Nov 25, 2024 • 41
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking Paper • 2403.09629 • Published Mar 14, 2024 • 75
V-STaR: Training Verifiers for Self-Taught Reasoners Paper • 2402.06457 • Published Feb 9, 2024 • 9
Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning Paper • 2406.12050 • Published Jun 17, 2024 • 19
Top LLM Collection Collection of top open-source LLMs, sorted best-first • 6 items • Updated Jul 26, 2024 • 13
Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level Paper • 2411.03562 • Published Nov 5, 2024 • 65
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective Paper • 2410.23743 • Published Oct 31, 2024 • 59
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations Paper • 2410.02707 • Published Oct 3, 2024 • 48