|
--- |
|
tags: |
|
- quantized |
|
- 2-bit |
|
- 3-bit |
|
- 4-bit |
|
- 5-bit |
|
- 6-bit |
|
- 8-bit |
|
- GGUF |
|
- text-generation
|
model_name: Llama-3-16B-Instruct-v0.1-GGUF |
|
base_model: MaziyarPanahi/Llama-3-16B-Instruct-v0.1 |
|
inference: false |
|
model_creator: MaziyarPanahi |
|
pipeline_tag: text-generation |
|
quantized_by: MaziyarPanahi |
|
--- |
|
# [MaziyarPanahi/Llama-3-16B-Instruct-v0.1-GGUF](https://huggingface.co/MaziyarPanahi/Llama-3-16B-Instruct-v0.1-GGUF) |
|
- Model creator: [MaziyarPanahi](https://huggingface.co/MaziyarPanahi) |
|
- Original model: [MaziyarPanahi/Llama-3-16B-Instruct-v0.1](https://huggingface.co/MaziyarPanahi/Llama-3-16B-Instruct-v0.1) |
|
|
|
## Description |
|
[MaziyarPanahi/Llama-3-16B-Instruct-v0.1-GGUF](https://huggingface.co/MaziyarPanahi/Llama-3-16B-Instruct-v0.1-GGUF) contains GGUF format model files for [MaziyarPanahi/Llama-3-16B-Instruct-v0.1](https://huggingface.co/MaziyarPanahi/Llama-3-16B-Instruct-v0.1). |
|
|
|
## Load GGUF models |
|
|
|
You `MUST` follow the Llama-3 prompt template, as in the following `llama.cpp` example:
|
|
|
|
|
```sh
./llama.cpp/main -m Llama-3-16B-Instruct-v0.1.Q2_K.gguf \
  -r '<|eot_id|>' \
  --in-prefix "\n<|start_header_id|>user<|end_header_id|>\n\n" \
  --in-suffix "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n" \
  -p "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are a helpful, smart, kind, and efficient AI assistant. You always fulfill the user's requests to the best of your ability.<|eot_id|>\n<|start_header_id|>user<|end_header_id|>\n\nHi! How are you?<|eot_id|>\n<|start_header_id|>assistant<|end_header_id|>\n\n" \
  -n 1024
```
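If you build the prompt programmatically, the same template can be assembled in Python. The sketch below uses a hypothetical helper, `build_llama3_prompt` (not part of llama.cpp or this repo), that formats one system and user turn into the Llama-3 layout shown in the command above:

```python
# Minimal sketch of the Llama-3 prompt template used above.
# build_llama3_prompt is a hypothetical helper, not a llama.cpp API.

def build_llama3_prompt(system: str, user: str) -> str:
    """Format one system + user turn in the Llama-3 instruct layout."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>\n"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>\n"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "You are a helpful, smart, kind, and efficient AI assistant.",
    "Hi! How are you?",
)
print(prompt)
```

The string ends with the opened `assistant` header, so generation continues from the assistant's turn; `<|eot_id|>` is passed to `llama.cpp` as the reverse prompt (`-r`) to stop generation at the end of that turn.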
|
|