Pretrain Data

HuggingFaceTB/smollm-corpus • Updated Sep 6 • 237M • 14.3k • 245
HuggingFaceFW/fineweb-edu-classifier • Text Classification • Updated 5 days ago • 192k • 132
HuggingFaceFW/fineweb • Updated Jul 16 • 46B • 384k • 1.75k
togethercomputer/RedPajama-Data-V2 • Updated about 23 hours ago • 3.97k • 348