Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM 3 days ago • 244
Article PaliGemma 2 Mix - New Instruction Vision Language Models by Google 24 days ago • 65
olmOCR Collection olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org • 3 items • Updated 1 day ago • 93
SYNTHETIC-1 Collection A collection of tasks & verifiers for reasoning datasets • 9 items • Updated 22 days ago • 49
An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging Paper • 2502.09056 • Published 30 days ago • 30
Article From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub about 1 month ago • 49
Hibiki fr-en Collection Hibiki is a model for streaming speech translation, which can run on device! See https://github.com/kyutai-labs/hibiki. • 5 items • Updated Feb 6 • 50
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 8 items • Updated 19 days ago • 397
Article Introducing the Synthetic Data Generator - Build Datasets with Natural Language Dec 16, 2024 • 117
Article Train 400x faster Static Embedding Models with Sentence Transformers Jan 15 • 159