Files changed (1)
  1. README.md +5 -3
README.md CHANGED
@@ -36,12 +36,14 @@ You can download only the quants you need instead of cloning the entire reposito
  huggingface-cli download MaziyarPanahi/Meta-Llama-3-70B-Instruct-GGUF --local-dir . --include '*Q2_K*gguf'
  ```
 
- ## Load sharded model
 
- `llama_load_model_from_file` will detect the number of files and will load additional tensors from the rest of files.
 
  ```sh
- llama.cpp/main -m Meta-Llama-3-70B-Instruct.Q2_K-00001-of-00005.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 1024 -e
  ```
 
  huggingface-cli download MaziyarPanahi/Meta-Llama-3-70B-Instruct-GGUF --local-dir . --include '*Q2_K*gguf'
  ```
 
+ ## Load GGUF models
+
+ You `MUST` follow the prompt template provided by Llama-3:
 
 
  ```sh
+ ./llama.cpp/main -m Meta-Llama-3-70B-Instruct.Q2_K.gguf -r '<|eot_id|>' --in-prefix "\n<|start_header_id|>user<|end_header_id|>\n\n" --in-suffix "<|eot_id|><|start_header_id|>assistant<|end_header_id|>
+ \n\n" -p "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are a helpful, smart, kind, and efficient AI assistant. You always fulfill the user's requests to the best of your ability.<|eot_id|>\n<|start_header_id|>user<|end_header_id|>\n\nHi!<|eot_id|>\n<|start_header_id|>assistant<|end_header_id|>\n\n" -n 1024
  ```
 
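
The `-p` string in the added command follows a fixed pattern: a system turn, a user turn, then an open assistant header. As a rough illustration only, the sketch below assembles that single-turn prompt in a small shell function so the template is written once. `build_llama3_prompt` is a hypothetical helper name, not part of llama.cpp, and the `\n` sequences are kept as literal backslash-n text, as in the added command. Whether `main` expands them may depend on the llama.cpp build, so the `-e` flag might still be needed.

```sh
#!/bin/sh
# Hypothetical helper (not part of llama.cpp): build a single-turn Llama-3 prompt.
# $1 = system message, $2 = user message.
# printf '%s' prints its argument verbatim, so "\n" stays as two characters.
build_llama3_prompt() {
  system_msg=$1
  user_msg=$2
  printf '%s' "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n${system_msg}<|eot_id|>\n<|start_header_id|>user<|end_header_id|>\n\n${user_msg}<|eot_id|>\n<|start_header_id|>assistant<|end_header_id|>\n\n"
}

# Usage: build the prompt once and inspect it before handing it to llama.cpp.
PROMPT=$(build_llama3_prompt "You are a helpful, smart, kind, and efficient AI assistant." "Hi!")
printf '%s\n' "$PROMPT"

# Assumed invocation, mirroring the flags used in the added command:
# ./llama.cpp/main -m Meta-Llama-3-70B-Instruct.Q2_K.gguf -r '<|eot_id|>' -p "$PROMPT" -n 1024
```

Keeping the template in one place makes it harder to drop an `<|eot_id|>` or a header token when the system or user text changes.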