SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 8 items • Updated 3 days ago • 115
🍓 Ichigo Collection The experimental family designed to train LLMs to understand sound natively. • 6 items • Updated 19 days ago • 16
Article Releasing Outlines-core 0.1.0: structured generation in Rust and Python 13 days ago • 36
Article 🇮🇹🇯🇵🇧🇷 Generating multilingual instruction datasets with Magpie 🐦⬛ By anakin87 • 13 days ago • 18
Article How to build a custom text classifier without days of human labeling By sdiazlor • 17 days ago • 54
Article How to optimize your data labelling project with custom interfaces By burtenshaw • 18 days ago • 18
Article Model2Vec: Distill a Small Fast Model from any Sentence Transformer By Pringled • 20 days ago • 54
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models Paper • 2410.02740 • Published about 1 month ago • 52
Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization Paper • 2409.12903 • Published Sep 19 • 21
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18 • 213