How did you convert to gguf?

#3
by scott0x - opened

Hello,

After converting the model to GGUF, I get garbage responses.

I have been using llama.cpp's GGUF conversion tools:

/root/llamacpp/convert-hf-to-gguf.py hfmodel --outfile model.gguf

/root/llamacpp/quantize model.gguf model_q4_k_m.gguf Q4_K_M

/app/gguf-py/scripts/gguf-new-metadata.py model_q4_k_m.gguf model_q4_k_m_with_meta.gguf --special-token prefix '<|fim_prefix|>' --special-token middle '<|fim_middle|>' --special-token suffix '<|fim_suffix|>'

But when I run inference on the model, I get garbage responses (random words, etc.).

Hi @scott0x
Alternatively, you can try this tool: https://huggingface.co/spaces/ggml-org/gguf-my-repo to convert any repo to GGUF out of the box.

Google org

Hi @scott0x , sorry for the late response. It looks like you are still facing the same issue of garbage responses after using llama.cpp's GGUF conversion tools. Before quantizing, can you please test the model right after the initial GGUF conversion, without applying Q4_K_M quantization, using the command below?

[Screenshot 2024-10-08 at 12.30.43 PM.png]
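The screenshot with the exact command is not reproduced here, but a test of the unquantized model might look like the sketch below. This assumes the `model.gguf` file produced by the conversion command earlier in the thread, and a llama.cpp build whose CLI binary is named `llama-cli` (older builds call it `main`); adjust paths and binary names to your setup:

```shell
# Run a short completion against the unquantized (F16/F32) GGUF
# to see whether garbage output appears before quantization.
# Paths and binary name are assumptions, not from the thread.
/root/llamacpp/llama-cli -m model.gguf -p "Hello, my name is" -n 64
```

If the unquantized model already produces garbage, the problem is in the conversion or tokenizer metadata rather than in the Q4_K_M quantization step.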

Kindly try and let me know if you have any concerns. Thank you.
