LayerSkip Collection Models continually pretrained using LayerSkip - https://arxiv.org/abs/2404.16710 • 7 items • Updated 3 days ago • 17
HelpSteer2-Preference: Complementing Ratings with Preferences Paper • 2410.01257 • Published 20 days ago • 15
Llama-3.1-Nemotron-70B Collection SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated 7 days ago • 112
Gemma-APS Release Collection Gemma models for text-to-propositions segmentation. The models are distilled from a fine-tuned Gemini Pro model applied to multi-domain synthetic data. • 3 items • Updated 7 days ago • 18
Load 4bit models 4x faster Collection Native bitsandbytes 4-bit pre-quantized models • 25 items • Updated 17 days ago • 48
Llama 3.2 All Versions Collection Meta's new Llama 3.2 vision and text models including 1B, 3B, 11B and 90B. Includes GGUF, 4-bit bnb and original versions. • 20 items • Updated 16 days ago • 33
Llama-3.2 Quantization Collection Llama 3.2 models quantized by Neural Magic • 9 items • Updated 26 days ago • 6
Llama 3.2 Evals Collection This collection provides detailed information on how we derived the reported benchmark metrics for the Llama 3.2 models, including the configurations • 4 items • Updated 27 days ago • 19
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 11 items • Updated 27 days ago • 386
MiniCheck & LLM-AggreFact Collection MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents • 6 items • Updated Aug 9 • 4
Qwen2.5-Math Collection Math-specific model series based on Qwen2.5 • 9 items • Updated 29 days ago • 38
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 14 items • Updated 27 days ago • 76
Qwen2.5 Collection Qwen2.5 language models, including pretrained and instruction-tuned models in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Sep 18 • 268
Moshi v0.1 Release Collection MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18 • 211
DataGemma Release Collection A series of pioneering open models that help ground LLMs in real-world data through Data Commons. • 2 items • Updated Sep 12 • 76
XGen-MM-1 models and datasets Collection A collection of all XGen-MM (Foundation LMM) models! • 14 items • Updated 14 days ago • 34
xLAM models Collection xLAM: A Family of Large Action Models to Empower AI Agent Systems: https://github.com/SalesforceAIResearch/xLAM • 9 items • Updated 14 days ago • 41
Jamba-1.5 Collection The AI21 Jamba family of models are state-of-the-art, hybrid SSM-Transformer instruction-following foundation models • 2 items • Updated Aug 22 • 80
Parler-TTS: fully open-source high-quality TTS Collection If you want to find out more about how these models were trained and even fine-tune them yourself, check out the Parler-TTS repository on GitHub. • 7 items • Updated Aug 8 • 44
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 27 days ago • 680
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated 27 days ago • 597
Arctic-embed Collection A collection of text embedding models optimized for retrieval accuracy and efficiency • 6 items • Updated Jul 18 • 14
CogVLM2 Collection This collection hosts the repos of THUDM's CogVLM2 releases • 8 items • Updated Aug 18 • 17
LLM Compiler Collection Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated Jun 27 • 147
Qwen2 Collection Qwen2 language models, including pretrained and instruction-tuned models in 5 sizes: 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated Sep 18 • 343
abliterated-v3 Collection Latest gen of the abliterated models I've produced • 17 items • Updated Jun 3 • 94
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 27 items • Updated Sep 18 • 480
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated Jul 31 • 325
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated Jul 31 • 137