How do I convert v0 to v1 for the new llama.cpp?


The current v0 model is incompatible with the new llama.cpp format.

Oops, I completely forgot about this one. I'll do it later today.

You'll need the Hugging Face-converted PyTorch files and then merge them into a single file; there should be a script for going from HF to .pth. Then convert the .pth file into a GGML F32 file (option 0), and finally quantize it to q4_1 (option 3).
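A minimal sketch of driving that pipeline from Python, assuming the merged .pth checkpoint already sits in a model directory, that you have a llama.cpp checkout of the era that ships `convert-pth-to-ggml.py` and the `quantize` binary (newer versions renamed these), and that step 1 writes `ggml-model-f32.bin`. Paths and output names are placeholders, not the canonical ones:

```python
# Sketch of the convert-then-quantize pipeline described above.
# Script names, paths, and the intermediate file name are assumptions --
# check your llama.cpp checkout for the converter it actually ships.
import subprocess
from pathlib import Path

MODEL_DIR = Path("models/7B")   # assumed: merged .pth + tokenizer live here
LLAMA_CPP = Path("llama.cpp")   # assumed: path to a llama.cpp checkout

def run(cmd):
    """Echo a command, then run it, failing loudly on a non-zero exit."""
    print("+", " ".join(str(c) for c in cmd))
    subprocess.run([str(c) for c in cmd], check=True)

# 1) Convert the merged .pth checkpoint to a GGML F32 file.
#    The trailing "0" is what the post calls "option 0" (f32).
run(["python3", LLAMA_CPP / "convert-pth-to-ggml.py", MODEL_DIR, "0"])

# 2) Quantize the F32 file down to q4_1.
#    The trailing "3" is what the post calls "option 3" (q4_1).
run([
    LLAMA_CPP / "quantize",
    MODEL_DIR / "ggml-model-f32.bin",   # assumed output name from step 1
    MODEL_DIR / "ggml-model-q4_1.bin",
    "3",
])
```

The same two steps can of course be run by hand from the shell; the wrapper just makes the order and the "0" / "3" options explicit.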

Any updates on the q4_1 model?
