Quantizations of https://huggingface.co/meta-llama/Meta-Llama-3-8B

Update (May 1, 2024): re-uploaded the models after this merge: https://github.com/ggerganov/llama.cpp/pull/6920. The models now work correctly (tested with the prompts 7777+3333 and 3333+777 using Q8_0; both gave correct results).
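For reference, the same arithmetic sanity check can be reproduced locally with the llama-cpp-python bindings. A minimal sketch, assuming one of the Q8_0 GGUF files from this repo has been downloaded; the filename below is illustrative:

# Reproduces the arithmetic test from the update note above.
# Requires: pip install llama-cpp-python
from llama_cpp import Llama

# Assumption: a Q8_0 quant downloaded locally under this (illustrative) name
llm = Llama(model_path="Meta-Llama-3-8B.Q8_0.gguf", n_ctx=512, verbose=False)

for prompt in ("7777+3333=", "3333+777="):
    out = llm(prompt, max_tokens=8, temperature=0.0)  # temperature 0 = greedy decoding
    print(prompt, out["choices"][0]["text"].strip())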

From the original readme

How to use

This repository contains two versions of Meta-Llama-3-8B, for use with transformers and with the original llama3 codebase.

Use with transformers

See the snippet below for usage with Transformers:

>>> import transformers
>>> import torch

>>> model_id = "meta-llama/Meta-Llama-3-8B"

>>> pipeline = transformers.pipeline(
...     "text-generation", model=model_id, model_kwargs={"torch_dtype": torch.bfloat16}, device_map="auto"
... )
>>> pipeline("Hey how are you doing today?")

Use with llama3

Please follow the instructions in the llama3 repository: https://github.com/meta-llama/llama3

To download the original checkpoints, see the example command below leveraging huggingface-cli:

huggingface-cli download meta-llama/Meta-Llama-3-8B --include "original/*" --local-dir Meta-Llama-3-8B

For Hugging Face support, we recommend using transformers or TGI, but a similar command works.
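
A single GGUF file can also be fetched programmatically with huggingface_hub; a sketch, again using a placeholder repo id and filename:

from huggingface_hub import hf_hub_download

# Placeholders: substitute this repo's id and the quant file you want
path = hf_hub_download(
    repo_id="your-username/Meta-Llama-3-8B-GGUF",
    filename="Meta-Llama-3-8B.Q8_0.gguf",
)
print(path)  # prints the local cache path of the downloaded file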

Model details

Format: GGUF
Model size: 8.03B params
Architecture: llama

Available quantizations: 1-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit
