Could you quant maldv/badger-iota-llama-3-8b?

#63
by maldv - opened

I actually tried this morning but failed; let me re-run it to see what the problem was.

not supported by llama.cpp, but maybe I can work around it

The pretokenizer is not supported by llama.cpp; I am forcing it to llama-3, which is hopefully the closest match. Quants should be incoming soon. Cheers!
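For anyone who wants to verify what actually landed in the converted files, here is a minimal sketch, assuming llama.cpp's `gguf` Python package (from `gguf-py`) and a hypothetical output filename, that reads back the pre-tokenizer tag the conversion wrote. "llama-bpe" is the identifier llama.cpp uses for the llama-3 pretokenizer:

```python
# Minimal sketch: inspect the pre-tokenizer tag in a converted GGUF.
# Assumes llama.cpp's gguf-py package (pip install gguf); the filename
# below is hypothetical.
from gguf import GGUFReader

reader = GGUFReader("badger-iota-llama-3-8b.Q8_0.gguf")
field = reader.get_field("tokenizer.ggml.pre")
if field is None:
    print("no pre-tokenizer tag stored (older GGUF)")
else:
    # for a string field, the value bytes live in the part indexed by data[0]
    print(bytes(field.parts[field.data[0]]).decode("utf-8"))  # expect "llama-bpe"
```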

mradermacher changed discussion status to closed

Thanks man. I'll have to dig into this as well and see if I have something wrong in a config. I know that when I rope-scaled with 'dynamic' it would break llama.cpp, so I wonder if there is something lingering from that.
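For context, 'dynamic' rope scaling lives in the model's config.json, which llama.cpp's converters did not support at the time. A quick sketch for checking whether such an entry is still lingering (the local path is hypothetical, the factor illustrative):

```python
# Hypothetical check: see whether a 'dynamic' rope_scaling entry is still
# lingering in the model's config.json.
import json
import pathlib

cfg = json.loads(pathlib.Path("badger-iota-llama-3-8b/config.json").read_text())
print(cfg.get("rope_scaling"))  # e.g. {"type": "dynamic", "factor": 2.0}
```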

It's merely the pretokenizer, not the rope config. I don't know exactly what's wrong, but one possibility is that the model you got the tokenizer config from is based on llama-3 from before they applied a fix to their repo, or from after they applied it. (I think https://huggingface.co/meta-llama/Meta-Llama-3-8B hashed to 0ef9807a4087ebef797fc749390439009c3b9eda9ad1a097abbe738f486c01e5, which is what llama.cpp uses, and now it's c136ed14d01c2745d4f60a9596ae66800e2b61fa45643e72436041855ad4089d).
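Those hashes come from llama.cpp's pre-tokenizer detection in convert-hf-to-gguf.py: the tokenizer encodes a fixed check string, and the sha256 of the resulting token list is matched against a table of known hashes. A minimal sketch of that check; note the check string here is a placeholder and must be copied verbatim from convert-hf-to-gguf.py for the hash to match:

```python
# Minimal sketch of llama.cpp's pre-tokenizer detection (mirrors
# get_vocab_base_pre() in convert-hf-to-gguf.py).
from hashlib import sha256

from transformers import AutoTokenizer

# placeholder -- copy the real check text verbatim from convert-hf-to-gguf.py,
# otherwise the hash will not match any known entry
CHKTXT = "..."

tokenizer = AutoTokenizer.from_pretrained("maldv/badger-iota-llama-3-8b")
chkhsh = sha256(str(tokenizer.encode(CHKTXT)).encode()).hexdigest()
print(chkhsh)
# 0ef9807a... -> known hash, mapped to "llama-bpe" (Meta-Llama-3-8B)
# unknown hash -> the converter raises NotImplementedError
```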

As such, everything might be fine :)

I see there is a bug report open for this (https://github.com/ggerganov/llama.cpp/issues/7069), but I guess it's being ignored, as llama.cpp thinks only "important" models matter.
