DeepHermes Collection Preview models of the hybrid reasoner Hermes series • 6 items • Updated Mar 13 • 27
Yuan 2.0-M32: Mixture of Experts with Attention Router Paper • 2405.17976 • Published May 28, 2024 • 22
Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs Paper • 2503.05139 • Published Mar 7 • 2
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models Paper • 2504.10449 • Published 10 days ago • 10
Ovis2 Collection Our latest advancement in multi-modal large language models (MLLMs) • 15 items • Updated 30 days ago • 59
Nemotron 3 8B Collection The Nemotron 3 8B Family of models is optimized for building production-ready generative AI applications for the enterprise. • 5 items • Updated about 23 hours ago • 49
OpenCodeReasoning Collection Reasoning data for supervised finetuning of LLMs to advance data distillation for competitive coding • 5 items • Updated about 23 hours ago • 7
Granite 3.3 Language Models Collection Our latest language models, licensed under the Apache 2.0 license. • 4 items • Updated 8 days ago • 30
BitNet Collection 🔥 BitNet family of large language models (1-bit LLMs). • 6 items • Updated 6 days ago • 28
GenPRM Collection A collection of GenPRM. Project page: https://ryanliu112.github.io/GenPRM • 6 items • Updated 18 days ago • 5