Here is a Google Colab with 1.1 support

#3
by eucdee - opened

Very cool!

@eucdee thanks for sharing. You can now try the triton branch of GPTQ-for-LLaMa for (hopefully) better performance, as the code has been fixed (remove "-b cuda").

Oh really? Qwopqwop's latest commits mean it works in text-generation-webui again? If so, that's great news.

@TheBloke , there is a little bug still: https://github.com/oobabooga/text-generation-webui/issues/1343#issuecomment-1513070072

So you need to edit text-generation-webui/modules/GPTQ_loader.py, replacing the old import with the new one:

#from modelutils import find_layers
from utils import find_layers
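
The edit above can be applied from the command line as well. A minimal sketch, assuming a default checkout layout where the file lives at text-generation-webui/modules/GPTQ_loader.py:

```shell
# Swap the old import for the new one in place, keeping a .bak backup.
# The path below is an assumption based on a default text-generation-webui checkout.
sed -i.bak 's/^from modelutils import find_layers$/from utils import find_layers/' \
    text-generation-webui/modules/GPTQ_loader.py
```

The `-i.bak` form works on both GNU and BSD sed, so the same command runs on Linux and macOS.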

Tested with:

  • oobabooga/text-generation-webui commit 9d9ae6293833ce31bbb5ed5d9a04b033d1e3896d
  • qwopqwop200/GPTQ-for-LLaMa commit d89cdcd8b53f61346290a28d326816af6a028434

Great, thanks for the details!

@eucdee I added a link to your Colab in the README. Thanks for providing it!

TheBloke changed discussion status to closed
