FineWeb: decanting the web for the finest text data at scale 🍷 Generate high-quality web text data for LLM training • 919
Sleep-time Compute: Beyond Inference Scaling at Test-time Paper • 2504.13171 • Published 6 days ago • 14
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 113