Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models Paper • 2311.00871 • Published Nov 1, 2023 • 2
Data Distributional Properties Drive Emergent In-Context Learning in Transformers Paper • 2205.05055 • Published Apr 22, 2022 • 2
WILBUR: Adaptive In-Context Learning for Robust and Accurate Web Agents Paper • 2404.05902 • Published about 1 month ago • 20
LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency Paper • 2404.12872 • Published 20 days ago • 9
What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation Paper • 2404.07129 • Published 29 days ago • 3
pyvene: A Library for Understanding and Improving PyTorch Models via Interventions Paper • 2403.07809 • Published Mar 12, 2024 • 1