mradermacher/Llama-3-8B-Ultra-Instruct-i1-GGUF

mradermacher committed on Apr 30

Commit 459eecb • 1 Parent(s): 40d6649

Update README.md

Files changed (1)

README.md +1 -0

README.md CHANGED

@@ -17,6 +17,7 @@ tags:
 <!-- ### vocab_type:  -->
 weighted/imatrix quants of https://huggingface.co/elinas/Llama-3-8B-Ultra-Instruct
+You should use `--override-kv tokenizer.ggml.pre=str:llama3` and a current llama.cpp version to work around a bug in the llama.cpp version that was used to make these quants (see https://old.reddit.com/r/LocalLLaMA/comments/1cg0z1i/bpe_pretokenization_support_is_now_merged_llamacpp/?share_id=5dBFB9x0cOJi8vbr-Murh).
 <!-- provided-files -->
 static quants are available at https://huggingface.co/mradermacher/Llama-3-8B-Ultra-Instruct-GGUF
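
For readers applying the workaround from the added README line, the following is a minimal sketch of how the override might be passed on the command line. The binary name (`./main`, which newer llama.cpp builds may ship as `llama-cli`), the model path, and the prompt are assumptions for illustration and do not come from the commit. Only the `--override-kv tokenizer.ggml.pre=str:llama3` argument is taken from the README. The script prints the command rather than running it, so it can be checked before use.

```shell
# Assumption: path to the llama.cpp CLI binary; adjust to your build.
LLAMA_BIN="./main"
# Assumption: hypothetical placeholder path to one of the downloaded quant files.
MODEL="path/to/model.gguf"

# From the README: override the pre-tokenizer metadata key at load time.
OVERRIDE="tokenizer.ggml.pre=str:llama3"

# Assemble the command line and print it for inspection (nothing is executed here).
CMD="$LLAMA_BIN -m $MODEL --override-kv $OVERRIDE -p 'Hello'"
printf '%s\n' "$CMD"
```

The override only changes metadata at load time, so the downloaded file itself is left unmodified.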