Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM 27 days ago • 376
Phi-4 (All Versions) Collection Microsoft's new Phi-4 models including mini in all formats. Includes GGUF, 4-bit bnb and original versions. Includes Unsloth's bug fixes. • 8 items • Updated 2 days ago • 48
🧠Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 20 items • Updated 7 days ago • 122
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 2 days ago • 215
Llama 3.2 Collection Meta's new Llama 3.2 vision and text models including 1B, 3B, 11B and 90B. Includes GGUF, 4-bit bnb and original versions. • 27 items • Updated 2 days ago • 60
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 275
Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction Paper • 2501.03218 • Published Jan 6 • 37
Article Llama can now see and run on your device - welcome Llama 3.2 Sep 25, 2024 • 187
Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models Paper • 2409.12139 • Published Sep 18, 2024 • 12
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling Paper • 2408.16532 • Published Aug 29, 2024 • 50
FocusLLM: Scaling LLM's Context by Parallel Decoding Paper • 2408.11745 • Published Aug 21, 2024 • 25
LongVILA: Scaling Long-Context Visual Language Models for Long Videos Paper • 2408.10188 • Published Aug 19, 2024 • 52
Article Welcome FalconMamba: The first strong attention-free 7B model Aug 12, 2024 • 110