MoshiVis v0.1 Collection MoshiVis is a Vision Speech Model built as a perceptually-augmented version of Moshi v0.1 for conversing about image inputs • 8 items • Updated 17 days ago • 22
MambaVision Collection MambaVision: A Hybrid Mamba-Transformer Vision Backbone. Includes both ImageNet-1K and ImageNet-21K pretrained models. • 13 items • Updated 3 days ago • 31
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM • 26 days ago • 376
Babel Collection Open Multilingual Large Language Models Serving Over 90% of Global Speakers • 7 items • Updated 5 days ago • 17
Phi-4 Collection Phi-4 family of small language and multi-modal models. • 7 items • Updated Mar 3 • 113
Ovis2 Collection Our latest advancement in multi-modal large language models (MLLMs) • 15 items • Updated 13 days ago • 59
Dolphin 3.0 Collection Dolphin 3.0 is the next generation of the Dolphin series of instruct-tuned models, designed to be the ultimate general-purpose local model. • 9 items • Updated Feb 7 • 122
Qwen2.5-1M Collection The long-context version of Qwen2.5, supporting 1M-token context lengths • 3 items • Updated Feb 26 • 111
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 11 items • Updated 7 days ago • 436
SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion Paper • 2412.10437 • Published Dec 11, 2024 • 4
Falcon3 Collection The Falcon3 family of open foundation models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters. • 40 items • Updated Feb 13 • 84
PaliGemma 2 Release Collection Vision-language models available in 3B, 10B, and 28B sizes, each in multiple variants. • 32 items • Updated 4 days ago • 146
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 9 items • Updated Nov 27, 2024 • 111