output is merely copy of input for 70b @ webui
#13 opened by wholehope
Can anybody enlighten me on how to run inference with a 70B-GPTQ model (chat or non-chat) using oobabooga/text-generation-webui? Whether I use the Llama-v2 instruct format mentioned on the model card or just a plain prompt, the output is always an exact copy of the input. In the same webui, I can run the 13B/7B GPTQ models (chat or non-chat) without any problem.
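For what it's worth, a useful sanity check is to try the model outside the webui with the plain transformers GPTQ integration. The sketch below is an assumption on my part, not something confirmed in this thread: the repo id, system prompt, and generation settings are placeholders, and it assumes `transformers` (>= 4.32) with `auto-gptq`, `optimum`, and `accelerate` installed. It also shows the Llama-2 instruct wrapping (`[INST] ... [/INST]`) that the model card refers to:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Hypothetical GPTQ repo id used for illustration only
model_id = "TheBloke/Llama-2-70B-chat-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# GPTQ weights are loaded automatically via the quantization_config in the repo
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Llama-2 instruct format: system prompt in <<SYS>> tags, user turn in [INST] tags
prompt = (
    "[INST] <<SYS>>\n"
    "You are a helpful assistant.\n"
    "<</SYS>>\n\n"
    "What is the capital of France? [/INST]"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)

# Strip the prompt tokens so only newly generated text is printed;
# if this still echoes the input, the problem is not webui-specific
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

If this generates sensible new tokens but the webui still echoes the input, the issue is more likely in the webui loader or prompt template settings than in the quantized weights themselves.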
I also have issues in the textgen webui: no tokens are generated, but only in the chat interface; the other interface works.