GPU acceleration working in Oobabooga WebUI

#2
by rombodawg - opened

Idk if they released an update since you wrote the model card for the airoboros 70b GGML model you posted, but GPU acceleration is working fine in oobabooga with this model, llama-2-70b-guanaco-QLoRA-GGML.

error loading model: llama.cpp: tensor 'layers.0.attention.wk.weight' has wrong shape; expected 8192 x 8192, got 8192 x 1024

Got this error when loading the model with llama-cpp-python on Linux (Python 3.11, llama-cpp-python 0.1.77).

@rombodawg thank you, I've updated my READMEs

@ThamaluM this is probably because you haven't passed the -gqa 8 parameter. I don't know how you do that with llama-cpp-python, but there must be a way, since text-generation-webui works OK with these models and it uses llama-cpp-python. Check the llama-cpp-python repo and see if there are instructions there.
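For anyone hitting the same error, here is a rough sketch of where the numbers come from and how the setting might be passed from Python. Llama 2 70B uses grouped-query attention with 64 query heads sharing 8 key/value heads, so the `wk` tensor is 8192 x 1024 instead of 8192 x 8192, which matches the shapes in the error. The `n_gqa` keyword is an assumption about the `Llama` constructor in llama-cpp-python releases from around 0.1.77, so check the signature in your installed version. The helper names and the model path are hypothetical.

```python
# Llama 2 70B attention dimensions.
N_EMBD = 8192    # hidden size
N_HEAD = 64      # query heads
N_KV_HEAD = 8    # key/value heads shared by groups of query heads


def gqa_factor(n_head: int = N_HEAD, n_kv_head: int = N_KV_HEAD) -> int:
    """Number of query heads that share one key/value head."""
    return n_head // n_kv_head


def wk_shape(n_gqa: int, n_embd: int = N_EMBD, n_head: int = N_HEAD) -> tuple:
    """Shape of attention.wk.weight for a given grouped-query factor."""
    head_dim = n_embd // n_head      # 128 for this model
    n_kv_head = n_head // n_gqa      # 64 without GQA, 8 with n_gqa=8
    return (n_embd, head_dim * n_kv_head)


def llama_kwargs(model_path: str, n_gpu_layers: int = 0) -> dict:
    """Keyword arguments for llama_cpp.Llama (hypothetical helper).

    Assumption: this llama-cpp-python version accepts `n_gqa`.
    """
    return {
        "model_path": model_path,
        "n_gqa": gqa_factor(),          # 8 for Llama 2 70B
        "n_gpu_layers": n_gpu_layers,   # layers to offload to the GPU
    }


def load_model(model_path: str, n_gpu_layers: int = 0):
    """Load the model; llama_cpp is imported only when this is called."""
    from llama_cpp import Llama
    return Llama(**llama_kwargs(model_path, n_gpu_layers))
```

Calling `load_model("path/to/model.ggmlv3.q4_0.bin")` with a placeholder path like that would then be the Python counterpart of passing `-gqa 8` on the llama.cpp command line, provided the installed version supports the keyword.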

Thanks for the answer, it worked.
