Pretrain Data

HuggingFaceTB/smollm-corpus • Updated Sep 6, 2024 • 237M • 8.75k • 298
HuggingFaceFW/fineweb-edu-classifier • Text Classification • Updated Nov 17, 2024 • 50.9k • 164
HuggingFaceFW/fineweb • Updated 19 days ago • 25B • 327k • 1.97k
togethercomputer/RedPajama-Data-V2 • Updated Nov 21, 2024 • 2.59k • 358