Language models scale reliably with over-training and on downstream tasks Paper • 2403.08540 • Published Mar 13