This gives gibberish nonsense in text-generation-server

#2 by Websteria - opened

I have not been able to get the quantized version to produce anything coherent. The HF version works great, just a bit slow.

Websteria changed discussion title from This gives gibberish in text-generation-server to This gives gibberish nonsense in text-generation-server

Please read the README.md. You either need to update text-generation-webui's GPTQ-for-LLaMa to the latest version, or else use the file koala-13B-4bit-128g.no-act-order.ooba.pt.

I am up to date with the latest files for text-generation-webui and GPTQ-for-LLaMa, and I can confirm I also get gibberish on the 7B and 13B quantized versions.

When you say you're up to date, are you sure you're using the right GPTQ-for-LLaMa version? It needs to be the qwopqwop repo, not the oobabooga fork.
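If you're not sure which fork you have, one way to check is the git remote of the checkout. Here is a minimal sketch, assuming the common layout where GPTQ-for-LLaMa is cloned under text-generation-webui/repositories/ (the path below is an assumption; adjust it for your install):

```python
# Sketch: check which GPTQ-for-LLaMa fork a text-generation-webui install is using,
# by reading the 'origin' remote URL of the checkout.
import subprocess
from pathlib import Path

# Assumed location of the GPTQ-for-LLaMa checkout; change this to match your setup.
REPO_DIR = Path("text-generation-webui/repositories/GPTQ-for-LLaMa")

def origin_url(repo_dir: Path) -> str:
    """Return the 'origin' remote URL of the given git checkout."""
    result = subprocess.run(
        ["git", "-C", str(repo_dir), "remote", "get-url", "origin"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

if __name__ == "__main__":
    url = origin_url(REPO_DIR)
    if "qwopqwop200" in url:
        print(f"OK: using the qwopqwop repo ({url})")
    else:
        print(f"Warning: origin is {url}; the act-order files expect "
              "https://github.com/qwopqwop200/GPTQ-for-LLaMa")
```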

If updating GPTQ-for-LLaMa is not working for you, just use koala-13B-4bit-128g.no-act-order.ooba.pt. Remove any other .pt/.safetensors files from your model directory so that koala-13B-4bit-128g.no-act-order.ooba.pt is the only one left; that file will work with any version of GPTQ-for-LLaMa.
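The cleanup step amounts to keeping only the no-act-order file. A minimal sketch, assuming the model sits in a local models/koala-13B-GPTQ-4bit-128g directory (the directory name is an assumption, and deletion is left commented out so nothing is removed by accident):

```python
# Sketch: list (and optionally remove) extra weight files so that only the
# no-act-order .ooba.pt file remains in the model directory.
from pathlib import Path

MODEL_DIR = Path("models/koala-13B-GPTQ-4bit-128g")   # assumed path; point at your model folder
KEEP = "koala-13B-4bit-128g.no-act-order.ooba.pt"     # the file that works with any GPTQ-for-LLaMa version

extra = [
    p for p in MODEL_DIR.iterdir()
    if p.suffix in (".pt", ".safetensors") and p.name != KEEP
]

for p in extra:
    print(f"Extra weight file found: {p.name}")
    # p.unlink()  # uncomment to actually delete it

if not extra:
    print(f"Only {KEEP} present; text-generation-webui will load that file.")
```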

This fixes my problem. Thank you!!!

Websteria changed discussion status to closed
