Cannot run batched inference in transformers

#20 opened by DatenlaborBerlin

`model = exllama_set_max_input_length(model, 4096)`

works in AutoGPTQ but not in transformers, where I get:

`'LlamaForCausalLM' object has no attribute 'quantize_config'`
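
For anyone reproducing this, a minimal sketch of the two code paths, assuming the auto_gptq and transformers APIs as of this thread; the repo name TheBloke/Llama-2-7B-GPTQ is just an example, not necessarily this thread's model. The helper appears to read `model.quantize_config`, which AutoGPTQ's wrapper provides but a plain `LlamaForCausalLM` does not, which would explain the error above:

```python
from auto_gptq import AutoGPTQForCausalLM, exllama_set_max_input_length
from transformers import AutoModelForCausalLM

# Working path: AutoGPTQ's model wrapper carries a `quantize_config`
# attribute, which exllama_set_max_input_length uses internally.
gptq_model = AutoGPTQForCausalLM.from_quantized(
    "TheBloke/Llama-2-7B-GPTQ",  # example repo
    device="cuda:0",
    use_safetensors=True,
)
gptq_model = exllama_set_max_input_length(gptq_model, max_input_length=4096)

# Failing path: transformers loads the same weights into a plain
# LlamaForCausalLM, which has no `quantize_config` attribute, so the
# same call raises the AttributeError reported above.
hf_model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-7B-GPTQ",  # example repo
    device_map="auto",
)
hf_model = exllama_set_max_input_length(hf_model, max_input_length=4096)
```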

I am facing the same issue when using the 128g branch

Please open a GitHub issue so @fxmarty can have a look at it.

I have just opened the issue on GitHub: https://github.com/huggingface/transformers/issues/26005
Please look into it, @fxmarty and @TheBloke.
