The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only Paper • 2306.01116 • Published Jun 1, 2023 • 33
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization? Paper • 2204.05832 • Published Apr 12, 2022
What Language Model to Train if You Have One Million GPU Hours? Paper • 2210.15424 • Published Oct 27, 2022 • 2
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 29