GPU acceleration working in Oobabooga WebUI

by rombodawg - opened Jul 28, 2023

Jul 28, 2023

I don't know if there has been an update since you wrote the model card for the airoboros 70B GGML model you posted, but GPU acceleration is working fine in oobabooga with this model, llama-2-70b-guanaco-QLoRA-GGML.

ThamaluM

Jul 28, 2023

error loading model: llama.cpp: tensor 'layers.0.attention.wk.weight' has wrong shape; expected 8192 x 8192, got 8192 x 1024

I got this error when loading the model with llama-cpp-python on Linux (Python 3.11, llama-cpp-python 0.1.77).

TheBloke

Owner Jul 28, 2023

@rombodawg thank you, I've updated my READMEs

@ThamaluM this is probably because you haven't passed the -gqa 8 parameter. I don't know how you do that with llama-cpp-python, but there must be a way, since text-generation-webui works OK with these models and it uses llama-cpp-python. Check the llama-cpp-python repo and see if there are instructions there.
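For anyone hitting the same error, here is a rough sketch of why the shapes differ and how the setting might be passed from Python. It assumes the grouping factor divides the key projection's second dimension (8192 / 8 = 1024, which matches the error above), and it assumes the `Llama` constructor in llama-cpp-python 0.1.77-era GGML builds accepts an `n_gqa` keyword; as far as I recall it does, but please verify against the version you have installed. The helper names and the model path are hypothetical.

```python
from typing import Any, Dict, Tuple


def wk_shape(n_embd: int, n_gqa: int) -> Tuple[int, int]:
    """Shape of the key projection as printed in the llama.cpp error.

    With grouped-query attention the key/value projections are n_gqa
    times narrower than the query projection.
    """
    if n_gqa < 1 or n_embd % n_gqa != 0:
        raise ValueError("n_embd must be divisible by a positive n_gqa")
    return (n_embd, n_embd // n_gqa)


def llama_kwargs(model_path: str, n_gqa: int = 8, n_gpu_layers: int = 0) -> Dict[str, Any]:
    """Keyword arguments for llama_cpp.Llama (hypothetical helper).

    ASSUMPTION: `n_gqa` is accepted by the installed llama-cpp-python;
    it is meant to mirror the `-gqa 8` command-line flag.
    """
    return {"model_path": model_path, "n_gqa": n_gqa, "n_gpu_layers": n_gpu_layers}


def load_model(model_path: str, **overrides: Any):
    """Load the model; the import is lazy so the helpers above work without the package."""
    from llama_cpp import Llama

    return Llama(**{**llama_kwargs(model_path), **overrides})


# Without the setting the loader expects a square matrix; with n_gqa=8 it
# expects the narrower shape that the 70B file actually contains.
print(wk_shape(8192, 1))
print(wk_shape(8192, 8))
```

Calling `load_model("path/to/model.ggmlv3.q4_0.bin", n_gpu_layers=40)` (placeholder path and layer count) would then pass the grouping factor along with the GPU offload setting. Newer GGUF-based releases are reported to read this value from the model file itself, so the keyword may not be needed or accepted there.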

ThamaluM

Jul 28, 2023

Thanks for the answer, it worked.
