Error when running pipe: temp_state buffer is too small

#35
by StefanStroescu - opened

Hello,

I am trying to use the model to generate an answer from a context I provide, but when I get to text generation I get this error: temp_state buffer is too small.
I think it is because my prompt is quite large in terms of tokens, since the model works when I prompt it without the context.
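For reference, this is roughly how I check the prompt length. A minimal sketch, assuming the standard tokenizer from the TheBloke/Llama-2-70B-chat-GPTQ repo and a placeholder prompt:

```python
from transformers import AutoTokenizer

# Load the tokenizer that ships with the quantized repo (assumption: same tokenizer as the base model).
tokenizer = AutoTokenizer.from_pretrained("TheBloke/Llama-2-70B-chat-GPTQ")

prompt_with_context = "..."  # the full prompt, context included

# Count how many tokens the prompt occupies; the error only shows up with long prompts.
n_tokens = len(tokenizer(prompt_with_context)["input_ids"])
print(f"Prompt length: {n_tokens} tokens")
```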

I checked and it is not a resource issue (GPU or RAM), and Llama-2-13B-chat-GPTQ worked when prompted with the same context.

Does anyone have any suggestions on how to solve this?

Thanks,

Thanks, Komposter43,

I don't know if it is related to this (https://huggingface.co/TheBloke/Llama-2-70B-chat-GPTQ/discussions/29), but I noticed that the model only accepts inference requests under 2048 tokens.
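In case it helps anyone hitting the same error: a minimal sketch of one possible workaround, assuming the model is loaded through AutoGPTQ with the ExLlama kernels (whose temporary buffers are sized for 2048 input tokens by default). I have not confirmed this matches the setup used here.

```python
from auto_gptq import AutoGPTQForCausalLM, exllama_set_max_input_length

# Load the quantized model (assumption: AutoGPTQ with the ExLlama backend on a single GPU).
model = AutoGPTQForCausalLM.from_quantized(
    "TheBloke/Llama-2-70B-chat-GPTQ",
    device="cuda:0",
    use_safetensors=True,
)

# Resize the ExLlama buffers so prompts longer than the default 2048 tokens fit;
# 4096 here is just an example value.
model = exllama_set_max_input_length(model, 4096)
```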
