llama.cpp tokenization bug

#1
by FlareRebellion - opened

This GGUF, like others derived from Llama 3 models, is probably affected by

https://github.com/ggerganov/llama.cpp/pull/6920

Can you recreate your quantization with the fixed commit?
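For reference, one way to tell whether a given GGUF already carries the new `tokenizer.ggml.pre` metadata that this PR introduced is to read it with the `gguf` Python package from the llama.cpp repo. A minimal sketch, with a hypothetical file name, and assuming the reader's parts/data field layout at the time of writing:

```python
# Check whether a GGUF has pre-tokenizer metadata (pip install gguf).
from gguf import GGUFReader

reader = GGUFReader("Aurora_l3_8B-Q4_K_M-imat.gguf")  # hypothetical file name
field = reader.fields.get("tokenizer.ggml.pre")
if field is None:
    # Files converted before the fix have no pre-tokenizer type at all.
    print("No pre-tokenizer type set: likely converted before the fix.")
else:
    # String fields store their value as raw UTF-8 bytes in `parts`,
    # indexed via `data` (gguf reader layout; treat as an assumption).
    pre = bytes(field.parts[field.data[0]]).decode("utf-8")
    print(f"Pre-tokenizer type: {pre}")  # expect 'llama-bpe' for Llama 3
```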

@FlareRebellion

If you use llama.cpp directly, you can already try the fix.

More context on this:

https://huggingface.co/ChaoticNeutrals/Poppy_Porpoise-v0.7-L3-8B/discussions/5#662fd94d83386c29d18fc140

I'll redo them as demand requires, at least for the most popular and best performing models, and will add a notice to the ones that still need to be updated. KoboldCpp has to get the upstream fixes before its users can actually benefit from them, and there's still a potential issue to be solved:

https://github.com/ggerganov/llama.cpp/issues/6914
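For anyone who wants to verify a quant themselves, a minimal sketch that compares llama.cpp's tokenization against the reference Hugging Face tokenizer, which is how these mismatches tend to surface. File and repo names are hypothetical; it assumes llama-cpp-python and transformers are installed and the base tokenizer is accessible:

```python
# Compare llama.cpp tokenization to the reference HF tokenizer.
from llama_cpp import Llama
from transformers import AutoTokenizer

llm = Llama(model_path="Aurora_l3_8B-Q4_K_M-imat.gguf", vocab_only=True)
hf = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")

# Mixed words, digits, and accents stress the BPE pre-tokenization.
text = "Hello world! 1234 éè"
ggml_ids = llm.tokenize(text.encode("utf-8"), add_bos=False)
hf_ids = hf.encode(text, add_special_tokens=False)
print("match" if ggml_ids == hf_ids else f"mismatch:\n  {ggml_ids}\n  {hf_ids}")
```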

> Can you recreate your quantization with the fixed commit?

@FlareRebellion - Will do these quants again and reupload. You'll still have to wait for the KoboldCpp 1.64 release to get the benefits, but the quants will at least already be ready.

Lewdiculous changed discussion status to closed

Issues seem to be getting fixed already, using the latest llama.cpp:
https://github.com/ggerganov/llama.cpp/issues/6914#issuecomment-2084315900

Facing issues with Aurora's tokenizer... I'll wait some more before looking into it; it might be a different issue.

Lewdiculous changed discussion status to open

@FlareRebellion For now I'd recommend you check out https://huggingface.co/Lewdiculous/Chaos_RP_l3_8B-GGUF-IQ-Imatrix, which should be as good as Aurora or better, and which I was able to re-quant properly. I'll talk to the author about Aurora.

Lewdiculous changed discussion status to closed
