Orpheus Multilingual Research Release Collection Beta release of multilingual models. • 12 items • Updated 19 days ago • 77
🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 21 items • Updated 14 days ago • 132
👩‍💻 OlympicCoder Collection Reasoning datasets and models for competitive coding • 4 items • Updated Mar 11 • 16
olmOCR Collection olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org • 4 items • Updated Mar 19 • 107
Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning Paper • 2502.06781 • Published Feb 10 • 61
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 228
WildChat-50m Collection All model responses associated with the WildChat-50m paper. • 55 items • Updated Jan 29 • 8