Error when running generate in text-generation-webui

#7
by droidriz - opened

Hi, I am getting an "AttributeError: 'LlamaForCausalLM' object has no attribute 'generate_with_streaming'" error with the following run:

```
python server.py --model MetaIX_GPT4-X-Alpasta-30b-4bit --model_type llama --wbits 4 --groupsize 128 --auto-devices
INFO:Loading MetaIX_GPT4-X-Alpasta-30b-4bit...
INFO:Found the following quantized model: models/MetaIX_GPT4-X-Alpasta-30b-4bit/gpt4-x-alpasta-30b-128g-4bit.safetensors
INFO:Loaded the model in 29.45 seconds.
```

Running with `--no-stream` instead gives this error:

```
raise ValueError(
ValueError: The following model_kwargs are not used by the model: ['context', 'token_count'] (note: typos in the generate arguments will also show up in this list)
```
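For context, recent transformers versions validate the keyword arguments passed to `generate()` and reject any the model does not accept; `context` and `token_count` are webui-side names, so if they get passed straight through to the model, this is exactly the error you see. A minimal sketch that reproduces it outside the webui (gpt2 is just a small stand-in model, not the one from this thread):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("Hello", return_tensors="pt").input_ids

# Raises: ValueError: The following `model_kwargs` are not used by the model:
# ['context', 'token_count'] ...
model.generate(input_ids, context="Hello", token_count=20)
```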

Hi, I am getting the same error when trying to run the model with the oobabooga webui:

```
ValueError: The following `model_kwargs` are not used by the model: ['context', 'token_count'] (note: typos in the generate arguments will also show up in this list)
Output generated in 0.01 seconds (0.00 tokens/s, 0 tokens, context 36, seed 16471360)
```

Same error for me with the MetaIX_OpenAssistant-Llama-30b-4bit model

OK, so for the other model I'm using, I got it to work again. I downgraded the `transformers_version` in `config.json` to "4.28.0.dev0", based on another working model. Then I got an error that not all tensors are on the same device, so I disabled the auto-devices, CPU, and disk offload settings in text-generation-webui. Just a quick fix.
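For anyone who wants to script the same workaround, here's a minimal sketch of that `config.json` edit; the path below assumes the model folder from this thread, so adjust it to your own setup:

```python
import json

path = "models/MetaIX_GPT4-X-Alpasta-30b-4bit/config.json"

with open(path) as f:
    config = json.load(f)

# Value copied from another working model, as described above
config["transformers_version"] = "4.28.0.dev0"

with open(path, "w") as f:
    json.dump(config, f, indent=2)
```

After that, launch without `--auto-devices` (and without any CPU/disk offload flags) so all tensors stay on one device.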
