How did you convert to gguf? #3
opened by scott0x
Hello,
After converting the model to GGUF, I am getting garbage responses.
I have been using llama.cpp's GGUF conversion tools:
/root/llamacpp/convert-hf-to-gguf.py hfmodel --outfile model.gguf
/root/llamacpp/quantize model.gguf model_q4_k_m.gguf Q4_K_M
/app/gguf-py/scripts/gguf-new-metadata.py model_q4_k_m.gguf model_q4_k_m_with_meta.gguf --special-token prefix '<|fim_prefix|>' --special-token middle '<|fim_middle|>' --special-token suffix '<|fim_suffix|>'
But when I run inference on the model, I get garbage responses (random words, etc.).
Hi @scott0x
Alternatively, you can try this tool: https://huggingface.co/spaces/ggml-org/gguf-my-repo, which converts any repo to GGUF out of the box.
Hi @scott0x, sorry for the late response. It looks like you are still facing the same issue of garbage responses after using llama.cpp's GGUF conversion tools. Before quantizing, can you please test the model right after the initial GGUF conversion, without applying Q4_K_M quantization? That will tell us whether the conversion itself or the quantization step is introducing the garbage output.
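The unquantized check could look something like this (a sketch only; paths, filenames, and the prompt are assumed from the commands earlier in the thread, and the inference binary is named `main` in older llama.cpp builds and `llama-cli` in newer ones):

```shell
# Run the unquantized GGUF produced by convert-hf-to-gguf.py,
# skipping the Q4_K_M quantization step entirely.
# Adjust the binary name and paths to match your build.
/root/llamacpp/main -m model.gguf -p "Hello, how are you?" -n 64
```

If the unquantized model responds normally, the problem lies in the quantization step; if it already produces garbage, the HF-to-GGUF conversion (for example a tokenizer/vocab mismatch) is the likely culprit.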
Kindly try this and let me know if you have any concerns. Thank you.