qwp4w3hyb/Meta-Llama-3-8B-Instruct-iMat-GGUF · Update: iMatrix Wasn't The Cause Of The Hallucinations

deleted

May 8, 2024

edited May 8, 2024

Edit: Never mind. The hallucinations I was seeing were not due to iMatrix.

Owner May 8, 2024

The dataset used does not contain only code. Code is just one part of it: it's a mixture of different datasets from different domains, precisely to avoid the problem you are getting at.

Increased hallucination is a general problem with quantization, and Llama-3 seems to be more affected by quantization than previous models, probably because it was trained on so many more tokens that its knowledge is more "dense", so it's easier to lose some smarts when quantizing.
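To make the general point about quantization loss concrete, here is a minimal sketch using plain symmetric round-to-nearest quantization on random Gaussian "weights". This is an assumption-heavy toy: it is not the k-quant or imatrix scheme that llama.cpp actually uses, the function names are made up for illustration, and it says nothing about whether Llama-3 is more sensitive than other models. It only shows that the round-trip error grows as the bit width shrinks.

```python
import random


def quantize_roundtrip(weights, bits):
    """Quantize to a signed `bits`-bit grid (round-to-nearest), then dequantize.

    Toy scheme for illustration only, not llama.cpp's actual quantization.
    """
    levels = 2 ** (bits - 1) - 1  # largest positive integer code, e.g. 7 for 4-bit
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / levels if max_abs > 0 else 1.0
    return [round(w / scale) * scale for w in weights]


def rms_error(weights, bits):
    """Root-mean-square difference between original and round-tripped weights."""
    restored = quantize_roundtrip(weights, bits)
    squared = sum((w - r) ** 2 for w, r in zip(weights, restored))
    return (squared / len(weights)) ** 0.5


random.seed(0)
# Hypothetical stand-in for one block of model weights.
weights = [random.gauss(0.0, 1.0) for _ in range(4096)]

errors = {bits: rms_error(weights, bits) for bits in (8, 6, 4, 3, 2)}
for bits, err in errors.items():
    print(f"{bits}-bit RMS error: {err:.4f}")
```

Real quantizers work per block and (with imatrix) weight the error by how much each value matters on calibration data, so the absolute numbers here are not comparable to any GGUF quant; only the trend is.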

I found these quants, which do not mention anything about imatrix:

https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF

They are quite old, though, so they probably do not include the latest BPE pre-tokenizer fixes in llama.cpp.

If you find non-imatrix quants and they work better, I would love to hear more details, but for now I would predict that you would have similar problems with non-imatrix quants.

deleted

May 8, 2024

@qwp4w3hyb You are right. I've since resolved this issue and forgot to come back and close this discussion. The hallucinations went away when using a recent GGUF of Llama-3 8B Instruct with the latest version of llama.cpp via KoboldCpp.

deleted changed discussion status to closed May 8, 2024

deleted changed discussion title from None iMatrix Versions Anywhere? to Update: iMatrix Wasn't The Cause The Hallucinations May 8, 2024

deleted changed discussion title from Update: iMatrix Wasn't The Cause The Hallucinations to Update: iMatrix Wasn't The Cause Of The Hallucinations May 8, 2024