It doesn't work with ExLlama at the moment

#1
by Shouyi987 - opened

Probably because of a different architecture:

RuntimeError: shape '[1, 74, 64, 128]' is invalid for input of size 75776
Output generated in 0.00 seconds (0.00 tokens/s, 0 tokens, context 75, seed 909967695)
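The numbers in the error itself suggest why the reshape fails. The code expects a tensor of 1 × 74 × 64 × 128 = 606,208 elements, but only 75,776 are present, and 75,776 divides out to 8 heads rather than 64. This is consistent with grouped-query attention (fewer key/value heads than query heads), which older loader code did not handle; the head counts below are inferred from the error, not confirmed:

```python
# Reproduce the size arithmetic from the error message.
# Expected shape: [batch, seq_len, q_heads, head_dim] = [1, 74, 64, 128]
batch, seq_len, q_heads, head_dim = 1, 74, 64, 128

expected = batch * seq_len * q_heads * head_dim  # elements the reshape wants
actual = 75776                                   # elements actually in the tensor

# The reshape fails because the sizes disagree:
print(expected, actual)          # 606208 75776

# The actual tensor is consistent with far fewer key/value heads,
# as in grouped-query attention (an inference from the sizes, not
# something stated in the error):
kv_heads = actual // (batch * seq_len * head_dim)
print(kv_heads)                  # 8
```

Updating Transformers pulls in model code that knows about the new attention layout, which is why the fix below works.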

I solved this error by updating to the latest version of the transformers library.

Yes, please update to the latest Transformers GitHub code to fix compatibility with AutoGPTQ and GPTQ-for-LLaMa. ExLlama won't work yet, I believe.

pip3 install git+https://github.com/huggingface/transformers

I have updated the README to reflect this. I should have added it last night, but I didn't get these uploaded until 4 am and forgot.
