Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state of the art open post-training recipes. • 32 items • Updated 6 days ago • 51
Thinking LLMs: General Instruction Following with Thought Generation Paper • 2410.10630 • Published Oct 14 • 17
Large Language Models Can Self-Improve in Long-context Reasoning Paper • 2411.08147 • Published 21 days ago • 59
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 40 items • Updated 6 days ago • 240
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 15 items • Updated 1 day ago • 184
🍓 Ichigo v0.3 Collection The experimental family designed to train LLMs to understand sound natively. • 6 items • Updated 23 days ago • 17
Article Releasing Outlines-core 0.1.0: structured generation in Rust and Python Oct 22 • 42
Article 🇮🇹🇯🇵🇧🇷 Generating multilingual instruction datasets with Magpie 🐦⬛ By anakin87 • Oct 21 • 18
Article How to build a custom text classifier without days of human labeling By sdiazlor • Oct 17 • 55
Article How to optimize your data labelling project with custom interfaces By burtenshaw • Oct 16 • 18
Article Model2Vec: Distill a Small Fast Model from any Sentence Transformer By Pringled • Oct 14 • 56
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models Paper • 2410.02740 • Published Oct 3 • 52
Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization Paper • 2409.12903 • Published Sep 19 • 21