Quantizations of https://huggingface.co/meta-llama/Meta-Llama-3-8B
Update (May 1, 2024): re-uploaded the models after this merge: https://github.com/ggerganov/llama.cpp/pull/6920. The models now work correctly (tested with 7777+3333 and 3333+777 using Q8_0; both gave correct results).
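As a sketch, a quantized GGUF from this repo can be run locally with llama.cpp along these lines (the GGUF filename below is an assumption; substitute whichever quantization you downloaded):

```shell
# Assumed filename; use the actual quantization file you downloaded from this repo.
# In older llama.cpp builds the binary is ./main; newer builds name it llama-cli.
./main -m Meta-Llama-3-8B.Q8_0.gguf -p "7777+3333=" -n 16
```

`-m` points at the model file, `-p` supplies the prompt, and `-n` caps the number of generated tokens.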
## From the original readme
### How to use
This repository contains two versions of Meta-Llama-3-8B: for use with `transformers` and with the original `llama3` codebase.
### Use with transformers
See the snippet below for usage with Transformers:

```python
import transformers
import torch

model_id = "meta-llama/Meta-Llama-3-8B"

pipeline = transformers.pipeline(
    "text-generation", model=model_id, model_kwargs={"torch_dtype": torch.bfloat16}, device_map="auto"
)
pipeline("Hey how are you doing today?")
```
### Use with llama3
Please follow the instructions in the repository.
To download the original checkpoints, see the example command below leveraging `huggingface-cli`:

```shell
huggingface-cli download meta-llama/Meta-Llama-3-8B --include "original/*" --local-dir Meta-Llama-3-8B
```
For Hugging Face support, we recommend using `transformers` or TGI, but a similar command works.