core dumps when attempting falcon-11B-Q6_K.gguf

#1
by LaferriereJC - opened

latest text-generation-webui

pip list | grep llama
exllamav2 0.0.20+cu121
gptq-for-llama 0.1.1+cu121
llama_cpp_python 0.2.75+cpuavx2
llama_cpp_python_cuda 0.2.75+cu121
llama_cpp_python_cuda_tensorcores 0.2.75+cu121

flash-attn 2.5.6

Python 3.10
P5200 (compute capability 6)
Rocky Linux 9
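
In case it helps, a quick sanity check I'd run in the webui's Python environment to confirm each backend at least imports cleanly; a crash here would point at the wheel itself rather than the model. The *_cuda module names and the __version__ attribute are my assumption based on the package names above:

# check that each llama-cpp backend imports; module names for the CUDA variants are assumed
python -c "import llama_cpp; print('llama_cpp', llama_cpp.__version__)"
python -c "import llama_cpp_cuda; print('llama_cpp_cuda', llama_cpp_cuda.__version__)"
python -c "import llama_cpp_cuda_tensorcores; print('llama_cpp_cuda_tensorcores', llama_cpp_cuda_tensorcores.__version__)"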

Attempted both with and without flash-attn.

Guessing this is a text-generation-webui problem, sadly, since llama.cpp's ./main loads the model without issue :') The output was garbage, though, likely because I just didn't format the prompt properly.
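
For reference, this is roughly the ./main invocation that loads for me; the flags are standard llama.cpp options, but the prompt formatting is just a guess on my part rather than the documented Falcon template:

# loads on CPU only (-ngl 0) and generates 256 tokens from a guessed prompt format
./main -m ./falcon-11B-Q6_K.gguf \
    -c 4096 \
    -n 256 \
    -ngl 0 \
    -p "User: Why is the sky blue? Assistant:"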

Can you try manually bumping your llama_cpp_python to 0.2.76?
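
I'm not certain a 0.2.76 build of the webui's llama_cpp_python_cuda packages has been published, so as a rough sketch this rebuilds the base package from source inside the webui environment (assumes the CUDA toolkit is on PATH; LLAMA_CUDA is my best guess at the CMake flag name for that llama.cpp vintage):

# force-reinstall a specific version, building the CUDA backend from source (flag name assumed)
CMAKE_ARGS="-DLLAMA_CUDA=on" pip install llama-cpp-python==0.2.76 --force-reinstall --no-cache-dir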
