Can't load q5_1 model

#1
by perelmanych - opened

I have tried to load the model with both the AVX2 and the cuBLAS builds of llama.cpp, but failed. llama.cpp is from the latest release. Here is the output:

C:\AI\llama>main -i --color --interactive-first -r "### Human:" -r "### Input:" -r "(Input)" -r "### Instruction:" -r "### User:" -r "User:" -r "USER:" -r "=============" --temp 0 --ctx_size 2048 --n_predict -1 --ignore-eos --repeat_penalty 1.2 --instruct -m wizardcoder-guanaco-15b-v1.1.ggmlv1.q5_1.bin --threads 8
main: build = 843 (6e7cca4)
main: seed = 1689451684
llama.cpp: loading model from wizardcoder-guanaco-15b-v1.1.ggmlv1.q5_1.bin
error loading model: unexpectedly reached end of file
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model 'wizardcoder-guanaco-15b-v1.1.ggmlv1.q5_1.bin'
main: error: unable to load model

perelmanych changed discussion title from Can't load model to Can't load q5_1 model

From readme:
Compatibility: These files are not compatible with llama.cpp, text-generation-webui or llama-cpp-python.

Model works for me using ctransformers (https://github.com/marella/ctransformers)


Oh, sorry. I got used to all GGML models running well on those three. The problem is that I need an OpenAI-compatible API. Kobold has its own API; about the rest I have no idea. TheBloke, can we expect quantized models compatible with those three tools?

PS: I just found out that LM Studio should have an OpenAI-compatible API, so I will try that.

Yeah, LM Studio is good. And ctransformers can also provide an OpenAI API, I believe.
