Article Welcome Gemma 4: Frontier multimodal intelligence on device +5 merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 908
SebastianBodza/Kartoffel_Orpheus-3B_german_natural-v0.1 Text-to-Speech • 3B • Updated May 17, 2025 • 651 • 17
unsloth/Mistral-Small-3.2-24B-Instruct-2506-GGUF Image-Text-to-Text • 24B • Updated Aug 26, 2025 • 24.5k • 175
Open ASR Leaderboard 🏆 Space • Explore and compare speech recognition model benchmarks • 1.37k
Article Mixture of Experts Explained +4 osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.14k
DiT: Self-supervised Pre-training for Document Image Transformer Paper • 2203.02378 • Published Mar 4, 2022 • 3
Decision Transformer: Reinforcement Learning via Sequence Modeling Paper • 2106.01345 • Published Jun 2, 2021 • 3
Offline Reinforcement Learning as One Big Sequence Modeling Problem Paper • 2106.02039 • Published Jun 3, 2021 • 2
docling-project/SmolDocling-256M-preview Image-Text-to-Text • 0.3B • Updated Sep 17, 2025 • 17.4k • 1.61k
DocLLM: A layout-aware generative language model for multimodal document understanding Paper • 2401.00908 • Published Dec 31, 2023 • 192