FineWeb: decanting the web for the finest text data at scale 🍷 Generate high-quality web text data for LLM training • Running • 711
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published 21 days ago • 106
ModernBERT workhorses. Collection A collection of powerful - but light - models to annotate data. • 3 items • Updated 22 days ago • 1