Yet another change to GGML quant formats...

#2
by Doctor-Shotgun - opened

I'm now running into an error trying to load these models on the latest llama-cpp-python. It appears that GGML had (yet another) change to the quant methods around the time these quants were made, which also reduces the size of the models slightly (Q4_0 is now 7.32 GB, down from 8.14 GB):
https://github.com/ggerganov/llama.cpp/pull/1508
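
For anyone wondering where the size drop comes from, here is a rough back-of-the-envelope check. It assumes (this is my understanding of the change, not something stated in this thread) that a Q4_0 block holds 32 weights at 4 bits each plus one scale, and that the scale went from fp32 to fp16. The function names are made up for illustration.

```python
# Assumed Q4_0 block layout (not confirmed in this thread):
# 32 weights per block at 4 bits each, plus one per-block scale.
WEIGHTS_PER_BLOCK = 32
PACKED_BYTES = WEIGHTS_PER_BLOCK * 4 // 8  # 16 bytes of packed 4-bit weights


def block_bytes(scale_bytes: int) -> int:
    """Total bytes per block for a given scale width (4 = fp32, 2 = fp16)."""
    return PACKED_BYTES + scale_bytes


def estimate_new_size_gb(old_size_gb: float) -> float:
    """Scale an old file size by the new/old block size ratio.

    Ignores anything in the file that is not stored as Q4_0 blocks,
    so treat the result as an approximation.
    """
    return old_size_gb * block_bytes(2) / block_bytes(4)


if __name__ == "__main__":
    print(f"old block: {block_bytes(4)} bytes, new block: {block_bytes(2)} bytes")
    print(f"estimated new size: {estimate_new_size_gb(8.14):.2f} GB")
```

Under those assumptions the estimate lands within rounding distance of the 7.32 GB figure above, which is consistent with the smaller files being the new format rather than a broken upload.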

I was able to find a set of updated quants for pyg13B that run on the current version - but as usual there was no love for metharme lol. Was wondering if you could do another set of quants with the new format - would be much appreciated!

GGML try not to break compatibility challenge... Sigh, I'll update all my GGML files this week. Thanks for letting me know!
