- The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs
  Paper • 2210.14986 • Published • 4
- Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2
  Paper • 2311.10702 • Published • 18
- Large Language Models as Optimizers
  Paper • 2309.03409 • Published • 73
- From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
  Paper • 2309.04269 • Published • 29
Collections
Collections including paper arxiv:2403.09629
- google/flan-t5-large
  Text2Text Generation • Updated • 1.6M • 505
- deepseek-ai/deepseek-coder-6.7b-instruct
  Text Generation • Updated • 145k • 328
- Object Recognition as Next Token Prediction
  Paper • 2312.02142 • Published • 11
- colbert-ir/dspy-Oct11-T5-Large-MH-3k-v1
  Text2Text Generation • Updated • 12 • 1

- OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset
  Paper • 2402.10176 • Published • 33
- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 50
- Beyond Language Models: Byte Models are Digital World Simulators
  Paper • 2402.19155 • Published • 47
- Matryoshka Representation Learning
  Paper • 2205.13147 • Published • 7

- Ada-Instruct: Adapting Instruction Generators for Complex Reasoning
  Paper • 2310.04484 • Published • 4
- Diversity of Thought Improves Reasoning Abilities of Large Language Models
  Paper • 2310.07088 • Published • 4
- Adapting Large Language Models via Reading Comprehension
  Paper • 2309.09530 • Published • 74
- Democratizing Reasoning Ability: Tailored Learning from Large Language Model
  Paper • 2310.13332 • Published • 14

- Large Language Models as Optimizers
  Paper • 2309.03409 • Published • 73
- Challenges and Applications of Large Language Models
  Paper • 2307.10169 • Published • 47
- Efficiently Modeling Long Sequences with Structured State Spaces
  Paper • 2111.00396 • Published • 1
- DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning
  Paper • 2006.08381 • Published