Adding Error Bars to Evals: A Statistical Approach to Language Model Evaluations • Paper • 2411.00640 • Published 23 days ago • 3
The Prompt Report: A Systematic Survey of Prompting Techniques • Paper • 2406.06608 • Published Jun 6 • 55
Training language models to follow instructions with human feedback • Paper • 2203.02155 • Published Mar 4, 2022 • 16
RULER: What's the Real Context Size of Your Long-Context Language Models? • Paper • 2404.06654 • Published Apr 9 • 34
Cosmos Tokenizer • Collection • A suite of image and video tokenizers • 10 items • Updated 18 days ago • 19
LLM Reasoning Papers • Collection • Papers to improve reasoning capabilities of LLMs • 15 items • Updated 22 days ago • 76
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone • Paper • 2404.14219 • Published Apr 22 • 254
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping • Paper • 2402.14083 • Published Feb 21 • 47
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset • Paper • 2402.10176 • Published Feb 15 • 36
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models • Paper • 2402.03300 • Published Feb 5 • 69
Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding • Paper • 2401.04398 • Published Jan 9 • 21
Tulu V2 Suite • Collection • The set of models associated with the paper "Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2" • 19 items • Updated 10 days ago • 42