llama.cpp unable to load the .gguf model

#3 opened by hrud

I am using Python 3.8.5 and the latest llama.cpp (commit 4aea3b846ec151cc6d08f93a8889eae13b286b06).
I downloaded the models from TheBloke using git lfs: `git lfs clone https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF`

```
bash-4.2$ ./main -ngl 32 -m ../models/Llama-2-13B-chat-GGUF/llama-2-13b-chat.Q2_K.gguf --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "[INST] <<SYS>>\nYou are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.\n<</SYS>>\n{prompt}[/INST]"
warning: not compiled with GPU offload support, --n-gpu-layers option will be ignored
warning: see main README.md for information on enabling GPU BLAS support
Log start
main: warning: changing RoPE frequency base to 0 (default 10000.0)
main: warning: scaling RoPE frequency by 0 (default 1.0)
main: build = 1281 (4aea3b8)
main: built with cc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-15) for x86_64-redhat-linux
main: seed  = 1695905013
gguf_init_from_file: invalid magic number 73726576
error loading model: llama_model_loader: failed to load model from ../models/Llama-2-13B-chat-GGUF/llama-2-13b-chat.Q2_K.gguf

llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model '../models/Llama-2-13B-chat-GGUF/llama-2-13b-chat.Q2_K.gguf'
main: error: unable to load model
```

I was able to fix this by manually downloading the model file. Somehow git lfs is not downloading the complete file.
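
The "invalid magic number 73726576" actually points at the cause: read back as bytes, 0x73726576 is the ASCII text "vers", i.e. the file begins with "version https://git-lfs.github.com/spec/v1", which is a git-lfs pointer file rather than the model itself. Here is a minimal sketch to check what you actually downloaded (pass the file path as the first argument):

```python
# Minimal sketch: inspect the first four bytes of a downloaded file.
# A real GGUF model begins with b"GGUF"; a stray git-lfs pointer file begins
# with "version https://...", so its first four bytes come back as b"vers"
# (0x73726576 when llama.cpp reads them as a little-endian uint32).
import sys

with open(sys.argv[1], "rb") as f:
    magic = f.read(4)

if magic == b"GGUF":
    print("valid GGUF magic")
elif magic == b"vers":
    print("git-lfs pointer file, not the model; re-download the full file")
else:
    print(f"unexpected magic: {magic!r}")
```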

As discussed in the README, I strongly discourage anyone from using Git to download files from HF, especially GGUF model files.
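
If you want a scriptable download instead of the web UI, the huggingface_hub library can fetch a single file. A minimal sketch, assuming huggingface_hub is installed and using the Q2_K filename from the command above:

```python
# Minimal sketch: download one GGUF file with huggingface_hub instead of git.
# Assumes `pip install huggingface_hub`; the filename is the Q2_K variant
# used in the ./main command above.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="TheBloke/Llama-2-13B-chat-GGUF",
    filename="llama-2-13b-chat.Q2_K.gguf",
)
print(model_path)  # local path to the fully downloaded model file
```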

I have downloaded the model 'llama-2-13b-chat.Q8_0.gguf' from HF, but I am still unable to load the model using Llama from llama_cpp.
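
For comparison, this is a minimal sketch of loading a GGUF file with the llama-cpp-python bindings (the model path is an assumption; point it at your downloaded Q8_0 file). If this fails with the same magic-number error, the download is likely incomplete:

```python
# Minimal sketch: load a GGUF model with llama-cpp-python and run one prompt.
# The model_path below is an assumption; adjust it to your download location.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-13b-chat.Q8_0.gguf",
    n_ctx=4096,  # same context size as the -c 4096 flag used above
)
out = llm("[INST] Hello, who are you? [/INST]", max_tokens=64)
print(out["choices"][0]["text"])
```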

@naina28-03 or @hrud, were you able to fix the issue? I manually downloaded the GGUF and it loaded fine for inference, but when I tried to split it and then merge it back, the merge failed with a similar error about the magic number. A quick sanity check is sketched below.
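
In case it helps narrow this down: each shard written by llama.cpp's gguf-split should itself begin with the GGUF magic, so you can sanity-check the shards before merging. A minimal sketch (the shard filename pattern is an assumption; adjust it to your files):

```python
# Minimal sketch: verify every shard starts with the GGUF magic before merging.
# The glob pattern is an assumption; adjust it to your shard filenames.
import glob

for shard in sorted(glob.glob("llama-2-13b-chat.Q2_K-*.gguf")):
    with open(shard, "rb") as f:
        magic = f.read(4)
    status = "ok" if magic == b"GGUF" else f"bad magic {magic!r}"
    print(f"{shard}: {status}")
```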
