Will it be converted to ggml q4?

#1
opened by ai2p

To run it with llama.cpp.

ai2p changed discussion title from Will me ggml q4 version? to Will it be converted to ggml q4?

I've done a GPTQ 4bit version for GPU inference here: https://huggingface.co/TheBloke/medalpaca-13B-GPTQ-4bit

Tomorrow I'll look at GGMLs as well.

I tried to convert it with the llama.cpp utils but got an error.

Oh yeah I forgot about this. I'll see what I can do.

Sorry, never mind. I finally got it working by just removing the two optimizer/scheduler .pt files.
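
For anyone hitting the same error, here's a minimal sketch of that workaround. It assumes the stray files are the usual trainer-state checkpoints (e.g. optimizer.pt, scheduler.pt) and that you're calling llama.cpp's Python conversion script; the local path and the script name are assumptions, since they vary between llama.cpp versions:

```python
import pathlib
import subprocess

# Assumed local path to the downloaded medalpaca-13b checkout.
model_dir = pathlib.Path("medalpaca-13b")

# Delete leftover trainer-state .pt checkpoints; the converter trips
# over them because they aren't model weights. (The actual weights are
# .bin shards, so globbing *.pt leaves them untouched.)
for pt in model_dir.glob("*.pt"):
    print(f"removing {pt}")
    pt.unlink()

# Run llama.cpp's converter on the cleaned directory (script name and
# flags depend on your llama.cpp version).
subprocess.run(["python", "convert.py", str(model_dir)], check=True)
```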

I have done a set of GGMLs here, using latest llama.cpp: https://huggingface.co/TheBloke/medalpaca-13B-GGML
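
A quick way to try one of those GGML files, as a sketch: this uses the llama-cpp-python bindings rather than the llama.cpp CLI (my own choice, either works), and the model filename below is hypothetical; substitute whichever q4 file you download from the repo:

```python
from llama_cpp import Llama

# Hypothetical filename; use the q4 GGML file you actually downloaded.
llm = Llama(model_path="medalpaca-13B.q4_0.bin")

out = llm("Q: What are the common symptoms of anemia?\nA:", max_tokens=128)
print(out["choices"][0]["text"])
```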
