4M Tokenizers Collection Multimodal tokenizers from https://4m.epfl.ch/ • 12 items • Updated 18 days ago • 3
LLM Compiler Collection Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated 6 days ago • 128
Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated 19 days ago • 147
PaliGemma Release Collection Pretrained and mix checkpoints for PaliGemma • 16 items • Updated 6 days ago • 119
Qwen2 Collection Qwen2 language models: pretrained and instruction-tuned models in 5 sizes (0.5B, 1.5B, 7B, 57B-A14B, and 72B). • 29 items • Updated 27 days ago • 226
Article Training and Finetuning Embedding Models with Sentence Transformers v3 May 28 • 115
Granite Code Models Collection A series of code models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 20 items • Updated 4 days ago • 145
Chinese Llama-3 series Collection This collection hosts the LLMs of Chinese-LLaMA-Alpaca-3 project, including Llama-3-Chinese, Llama-3-Chinese-Instruct, etc. • 12 items • Updated May 30 • 11
Gemma release Collection Groups the Gemma models released by the Google team. • 40 items • Updated 6 days ago • 318
PersonaLLM: Investigating the Ability of Large Language Models to Express Personality Traits Paper • 2305.02547 • Published May 4, 2023 • 7
Article Run the strongest open-source LLM model: Llama3 70B with just a single 4GB GPU! By lyogavin • Apr 21 • 40
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published Apr 22 • 124
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 22 items • Updated May 31 • 355
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22 • 240
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Apr 18 • 618
DBRX Collection DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items • Updated Mar 27 • 90
Breeze-7B Collection Breeze-7B is a language model family that builds on top of Mistral-7B, specifically intended for Traditional Chinese use. • 8 items • Updated Apr 25 • 9
⭐ StarCoder Collection All models, datasets, and demos related to StarCoder! • 11 items • Updated Feb 27 • 20