- TinyGSM: achieving >80% on GSM8k with small language models
  Paper • 2312.09241 • Published • 34
- Learning From Mistakes Makes LLM Better Reasoner
  Paper • 2310.20689 • Published • 24
- KwaiYiiMath: Technical Report
  Paper • 2310.07488 • Published • 2
- MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning
  Paper • 2309.05653 • Published • 9
Collections
Collections including paper arxiv:2310.20689
- Orca 2: Teaching Small Language Models How to Reason
  Paper • 2311.11045 • Published • 69
- Learning From Mistakes Makes LLM Better Reasoner
  Paper • 2310.20689 • Published • 24
- Let's Verify Step by Step
  Paper • 2305.20050 • Published • 3
- SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning
  Paper • 2308.00436 • Published • 20

- GAIA: a benchmark for General AI Assistants
  Paper • 2311.12983 • Published • 175
- Fine-tuning Language Models for Factuality
  Paper • 2311.08401 • Published • 26
- LayoutPrompter: Awaken the Design Ability of Large Language Models
  Paper • 2311.06495 • Published • 9
- Prompt Engineering a Prompt Engineer
  Paper • 2311.05661 • Published • 19

- Matryoshka Diffusion Models
  Paper • 2310.15111 • Published • 39
- Data Filtering Networks
  Paper • 2309.17425 • Published • 6
- FlashDecoding++: Faster Large Language Model Inference on GPUs
  Paper • 2311.01282 • Published • 31
- E3 TTS: Easy End-to-End Diffusion-based Text to Speech
  Paper • 2311.00945 • Published • 11

- Detecting Pretraining Data from Large Language Models
  Paper • 2310.16789 • Published • 9
- Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
  Paper • 2310.13671 • Published • 17
- AutoMix: Automatically Mixing Language Models
  Paper • 2310.12963 • Published • 14
- An Emulator for Fine-Tuning Large Language Models using Small Language Models
  Paper • 2310.12962 • Published • 13

- PaLI-3 Vision Language Models: Smaller, Faster, Stronger
  Paper • 2310.09199 • Published • 21
- A Zero-Shot Language Agent for Computer Control with Structured Reflection
  Paper • 2310.08740 • Published • 14
- Personality Traits in Large Language Models
  Paper • 2307.00184 • Published • 19
- An Emulator for Fine-Tuning Large Language Models using Small Language Models
  Paper • 2310.12962 • Published • 13

- MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
  Paper • 2309.04662 • Published • 21
- Neurons in Large Language Models: Dead, N-gram, Positional
  Paper • 2309.04827 • Published • 16
- Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
  Paper • 2309.05516 • Published • 8
- DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
  Paper • 2309.03907 • Published • 6