Article You could have designed state of the art positional encoding Nov 25, 2024 • 201
Gemma 3 QAT Collection Quantization Aware Trained (QAT) Gemma 3 checkpoints. The model preserves quality similar to half precision while using 3x less memory • 8 items • Updated about 16 hours ago • 58
Article Training and Finetuning Reranker Models with Sentence Transformers v4 9 days ago • 93
MoCha: Towards Movie-Grade Talking Character Synthesis Paper • 2503.23307 • Published 5 days ago • 65
CoRNStack Collection State-of-the-art code retrieval and re-ranking models and datasets • 9 items • Updated 8 days ago • 15
Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based on Qwen2.5 • 3 items • Updated 8 days ago • 76
TxGemma Release Collection Collection of open models to accelerate the development of therapeutics. • 5 items • Updated about 16 hours ago • 39
💫StarVector Models Collection StarVector is a multimodal LLM for Scalable Vector Graphics (SVG) generation, producing structured SVG code directly from images and text. • 2 items • Updated 14 days ago • 90
MoshiVis v0.1 Collection MoshiVis is a Vision Speech Model built as a perceptually-augmented version of Moshi v0.1 for conversing about image inputs • 8 items • Updated 14 days ago • 20
DeepHermes Collection Preview models of hybrid reasoner Hermes series • 6 items • Updated 21 days ago • 27
Llama Nemotron Collection Open, Production-ready Enterprise Models • 3 items • Updated about 3 hours ago • 25
EXAONE-Deep Collection EXAONE reasoning model series of 2.4B, 7.8B, and 32B, optimized for reasoning tasks including math and coding • 9 items • Updated 17 days ago • 85