Error when trying to ask a question - ggml_allocr_alloc: not enough space in the buffer (needed 178227200, largest block available 19333120)

#3
by RajeshkumarV - opened

python main.py "What is the x number?"
ggml_allocr_alloc: not enough space in the buffer (needed 178227200, largest block available 19333120)
GGML_ASSERT: C:\Users\rajesh\AppData\Local\Temp\pip-install-0ohg_aj6\llama-cpp-python_29c4846b4af1471bbb28a41659b32aa3\vendor\llama.cpp\ggml-alloc.c:144: !"not enough space in the buffer"
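For scale, the numbers in that message can be read off directly: ggml asked for roughly 170 MiB while the largest contiguous free block in its buffer was only about 18 MiB. A quick sanity check, using the values copied from the error above:

```python
# Figures copied verbatim from the ggml_allocr_alloc error above
needed = 178_227_200       # bytes ggml tried to allocate
largest_free = 19_333_120  # largest contiguous free block in the buffer

shortfall = needed - largest_free
print(f"needed:    {needed / 2**20:.1f} MiB")        # ~170 MiB
print(f"available: {largest_free / 2**20:.1f} MiB")  # ~18 MiB
print(f"shortfall: {shortfall / 2**20:.1f} MiB")
```

So the allocation is short by roughly 150 MiB, which is why rebooting (to defragment memory) was the first suggestion below.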

Try rebooting your PC - your memory seems highly fragmented (or hopelessly filled up)

I tried rebooting, but unfortunately it didn't work :(

Unfortunately, I don't have any experience with the Python version of llama.cpp - I use the original C++ variant only, and that is proven to work. Can you try the original llama.cpp instead?

I am not sure how that would work. I am using this code example on my Windows 11 PC: https://github.com/singlestore-labs/webinar-code-examples/tree/main/llama-2-local

Also, I am not sure why I am getting a GGML error when I am using the GGUF version of the model.

Oh, don't use that - it's far too old and hopelessly outdated. You should definitely use the original llama.cpp for GGUF and/or large contexts!

But since the code is in Python, shouldn't I use the llama-cpp-python package instead of the llama.cpp package?

No, llama.cpp is written in C++, as the name implies.

Yes, but this C++ version was written to run in Mac/Linux environments; for Windows it requires the Python llama-cpp-python package. https://github.com/abetlen/llama-cpp-python

Even then, you should use the newest version available - no more than about two days old. That's important because GGUF support is still in the making.

And, since I use Macs only, I can't help you with Windows-specific problems - I'm sorry
