llama.cpp breaks quantized ggml file format

#11
by Waldschrat - opened

llama.cpp decided to break the quantized ggml file format: https://github.com/ggerganov/llama.cpp/pull/1305

As nobody seems to be able (or willing) to provide a conversion script, the models need to be requantized (is that even a word?) from the source models.

This is quite a hurdle for people new to the field (like me), so: may I ask you to please quantize and upload the models in the new format?
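For anyone who does want to requantize from the source weights themselves, the general flow with llama.cpp's own tools looks roughly like this. This is a sketch only: the paths are placeholders, and the exact script and type names depend on which llama.cpp checkout you use (older versions ship convert-pth-to-ggml.py instead of convert.py, for example).

```shell
# Sketch only: paths are placeholders; script and flag names
# vary between llama.cpp versions.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make quantize

# Convert the original (e.g. PyTorch) weights to an f16 ggml file.
# Older checkouts use convert-pth-to-ggml.py instead of convert.py.
python3 convert.py /path/to/source-model --outfile model-f16.ggml

# Requantize into the new format, e.g. q5_0.
./quantize model-f16.ggml model-q5_0.ggml q5_0
```

Because the on-disk format changed, files quantized before the linked PR cannot simply be renamed or patched; the f16 conversion has to be redone from the source model and then quantized again with an up-to-date quantize binary.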

Updated the quants and added q5_0

Might have to be updated again heh

Updating again today...
