Meta-Llama-3-8B-Instruct GGUF

Original model: Meta-Llama-3-8B-Instruct

Model creator: Meta

Meta developed and released the Meta Llama 3 family of large language models (LLMs), a collection of pretrained and instruction-tuned generative text models in 8B and 70B sizes. The Llama 3 instruction-tuned models are optimized for dialogue use cases and outperform many of the available open-source chat models on common industry benchmarks. Further, in developing these models, we took great care to optimize helpfulness and safety.

Llama 3 is an auto-regressive language model that uses an optimized transformer architecture. The tuned versions use supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) to align with human preferences for helpfulness and safety.

This repo contains GGUF format model files for Meta’s Llama-3-8B-Instruct, updated as of 2024-04-29 to incorporate tokenization improvements, along with the earlier fixes that treat the <|eot_id|> special token as the EOS token.

Learn more on Meta’s Llama 3 page.

What is GGUF?

GGUF is a file format for representing AI models. It is the third version of the format, introduced by the llama.cpp team on August 21st 2023. It is a replacement for GGML, which is no longer supported by llama.cpp. The files in this repo were converted with llama.cpp build 2763 (revision ffe666), using autogguf.
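
As a rough illustration of the format, the sketch below reads the fixed-size header at the start of a GGUF file: the 4-byte magic `GGUF`, a 32-bit version, then 64-bit tensor and metadata key/value counts. It assumes a little-endian file of version 2 or later. The function name `read_gguf_header` is hypothetical and is not part of llama.cpp.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4 (magic) + 4 (version) + 8 (tensor count) + 8 (KV count)


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) for a GGUF file.

    Assumption: little-endian layout with 64-bit counts (GGUF v2 and later).
    """
    with open(path, "rb") as f:
        header = f.read(HEADER_SIZE)

    if len(header) < HEADER_SIZE or header[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file, or header is truncated")

    # "<IQQ": little-endian uint32 followed by two uint64 values.
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:])
    if version < 2:
        raise ValueError("GGUF v1 uses 32-bit counts; not handled in this sketch")
    return version, tensor_count, kv_count
```

Calling `read_gguf_header("model.gguf")` on a downloaded file (the path here is a placeholder) is a quick way to confirm that the download is a GGUF file before loading it.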

Prompt template

<|start_header_id|>system<|end_header_id|>

{{system_prompt}}<|eot_id|><|start_header_id|>user<|end_header_id|>

{{prompt}}<|eot_id|><|start_header_id|>assistant<|end_header_id|>
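
The template above can be filled in with plain string formatting. The following is a minimal sketch that follows the template literally; the helper names `build_llama3_prompt` and `truncate_at_eot` are hypothetical. It does not add a beginning-of-text token, on the assumption that the runtime loading the GGUF file handles that.

```python
EOT = "<|eot_id|>"


def build_llama3_prompt(system_prompt, prompt):
    """Fill the prompt template with a system prompt and one user message."""
    return (
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}{EOT}"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{prompt}{EOT}"
        # The assistant header is left open so the model writes the reply.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )


def truncate_at_eot(completion):
    """Cut generated text at the first <|eot_id|>, if the runtime did not stop there."""
    return completion.split(EOT, 1)[0]


if __name__ == "__main__":
    print(build_llama3_prompt("You are a helpful assistant.", "What is GGUF?"))
```

With the updated files in this repo, generation should already stop at `<|eot_id|>`, so `truncate_at_eot` is only a fallback for runtimes that do not treat it as the EOS token.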


Download & run with cnvrs on iPhone, iPad, and Mac!

cnvrs.ai

cnvrs is the best app for private, local AI on your device:

  • create & save Characters with custom system prompts & temperature settings
  • download and experiment with any GGUF model you can find on Hugging Face!
  • make it your own with custom Theme colors
  • powered by Metal ⚡️ & llama.cpp, with haptics during response streaming!
  • try it out yourself today, on TestFlight!
  • follow cnvrs on Twitter to stay up to date

Original Model Evaluation

Benchmark               Llama 3 8B    Llama 2 7B    Llama 2 13B
MMLU (5-shot)           68.4          34.1          47.8
GPQA (0-shot)           34.2          21.7          22.3
HumanEval (0-shot)      62.2          7.9           14.0
GSM-8K (8-shot, CoT)    79.6          25.7          77.4
MATH (4-shot, CoT)      30.0          3.8           6.7

Model size: 8.03B params

Architecture: llama