|
--- |
|
tags: |
|
- quantized |
|
- 2-bit |
|
- 3-bit |
|
- 4-bit |
|
- 5-bit |
|
- 6-bit |
|
- 8-bit |
|
- GGUF |
|
- text-generation
|
model_name: Llama-3-16B-Instruct-v0.1-GGUF |
|
base_model: MaziyarPanahi/Llama-3-16B-Instruct-v0.1 |
|
inference: false |
|
model_creator: MaziyarPanahi |
|
pipeline_tag: text-generation |
|
quantized_by: MaziyarPanahi |
|
--- |
|
# [MaziyarPanahi/Llama-3-16B-Instruct-v0.1-GGUF](https://huggingface.co/MaziyarPanahi/Llama-3-16B-Instruct-v0.1-GGUF) |
|
- Model creator: [MaziyarPanahi](https://huggingface.co/MaziyarPanahi) |
|
- Original model: [MaziyarPanahi/Llama-3-16B-Instruct-v0.1](https://huggingface.co/MaziyarPanahi/Llama-3-16B-Instruct-v0.1) |
|
|
|
## Description |
|
[MaziyarPanahi/Llama-3-16B-Instruct-v0.1-GGUF](https://huggingface.co/MaziyarPanahi/Llama-3-16B-Instruct-v0.1-GGUF) contains GGUF format model files for [MaziyarPanahi/Llama-3-16B-Instruct-v0.1](https://huggingface.co/MaziyarPanahi/Llama-3-16B-Instruct-v0.1). |
|
|
|
## Load GGUF models |
|
|
|
You `MUST` follow the Llama-3 prompt template, as in the following `llama.cpp` example:
|
|
|
|
|
```sh
./llama.cpp/main -m Llama-3-16B-Instruct-v0.1.Q2_K.gguf \
  -r '<|eot_id|>' \
  --in-prefix "\n<|start_header_id|>user<|end_header_id|>\n\n" \
  --in-suffix "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n" \
  -p "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are a helpful, smart, kind, and efficient AI assistant. You always fulfill the user's requests to the best of your ability.<|eot_id|>\n<|start_header_id|>user<|end_header_id|>\n\nHi! How are you?<|eot_id|>\n<|start_header_id|>assistant<|end_header_id|>\n\n" \
  -n 1024
```
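If you build the prompt programmatically, the same template can be assembled in Python. The sketch below uses a hypothetical helper, `build_llama3_prompt` (not part of llama.cpp or this repo), that formats one system and user turn into the Llama-3 layout shown in the command above:

```python
# Minimal sketch of the Llama-3 prompt template used above.
# build_llama3_prompt is a hypothetical helper, not a llama.cpp API.

def build_llama3_prompt(system: str, user: str) -> str:
    """Format one system + user turn in the Llama-3 instruct layout."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>\n"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>\n"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "You are a helpful, smart, kind, and efficient AI assistant.",
    "Hi! How are you?",
)
print(prompt)
```

The string ends with the opened `assistant` header, so generation continues from the assistant's turn; `<|eot_id|>` is passed to `llama.cpp` as the reverse prompt (`-r`) to stop generation at the end of that turn.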
|
|