Generated text is wrong

#18
by OveJie - opened

The input is:

[INST] <<SYS>>
Answer the questions.
<</SYS>>
Input [/INST]who are you?

The output is:

[INST] <<SYS>>
Answer the questions.
<</SYS>>
Input [/INST]who are you? Kirchengenтуриödynastattrohibitionergan statute appointmentsocalringbone curiosity nasalysoftiratovisalloccoKitten hierstrapbourgansteksreetamericannecttuplingahr Jimmyattungedad prevamps Dum Sw zero internshipsplain Wardensikurile trokingtons lucionariororneurvoyaz pitfalls Babieseiмого nest Tennis noscht rememberidosлін mindsShiftichi prof▸nab PaleAUargetingtons corporatezzaheit scratchespirelesslyembergeniblematicamentewebdriverunas向 Lad心yme message apartilleryokoazonertanest abstra knockbucketebol Sulflipido conductivity論FE Ribieroунidal fine tunalet pipespecern arbitraryelles draft Waldorfunto dinner典 GovernisseurToggle conjunctionalitiesouglasmicudem virtuallyommenatraDF Bernardinewsyerollarsrebкт spreadsheets integrity Fredericalecuoire Aerqualivalentciu Ayiakestampdasiskitainewyachtillettechniquesxc mantzar journalismNBapiDoc

The output is gibberish. How do I solve this?


Are you using AutoGPTQ with a model quantized with both group_size and desc_act?

If so, there is a bug in AutoGPTQ that causes this. It was fixed yesterday but is not yet in a released version.

What are you using to do inference? Which UI? If you are using text-generation-webui, I recommend you use ExLlama instead, which is faster and doesn't have this issue.
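
For reference, launching text-generation-webui with ExLlama from the command line looks roughly like this. This is a hedged sketch: the --loader flag and the local model directory name are assumptions based on a recent webui checkout, not details taken from this thread.

python server.py --loader exllama --model TheBloke_Llama-2-13B-chat-GPTQ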

If you want to use AutoGPTQ from Python code, you have three options (a minimal inference sketch follows the list):

  1. Use a model that has either group_size or desc_act, not both together, until AutoGPTQ 0.3.1 is released (hopefully today or tomorrow).

  2. Downgrade to AutoGPTQ 0.2.2 and use that until 0.3.1 is released:

     pip3 uninstall -y auto-gptq
     GITHUB_ACTIONS=true CUDA_VERSION="" pip3 install auto-gptq==0.2.2

  3. Install AutoGPTQ 0.3.1 from source, with:

     pip3 uninstall -y auto-gptq
     git clone https://github.com/PanQiWei/AutoGPTQ
     cd AutoGPTQ
     pip3 install -v .
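
Once a working AutoGPTQ version is installed, inference from Python looks roughly like the sketch below. This is a hedged example, not code from this thread: the model name, prompt, and generation parameters are illustrative, and some quantized repos also need a model_basename argument to from_quantized.

from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_name_or_path = "TheBloke/Llama-2-13B-chat-GPTQ"  # illustrative model choice

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)

# Load the quantized weights onto the GPU; some repos also require model_basename=...
model = AutoGPTQForCausalLM.from_quantized(
    model_name_or_path,
    use_safetensors=True,
    device="cuda:0",
)

# Standard Llama-2 chat format puts the user message before [/INST]
prompt = "[INST] <<SYS>>\nAnswer the questions.\n<</SYS>>\n\nwho are you? [/INST]"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to("cuda:0")
output_ids = model.generate(input_ids=input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))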

Thank you so much! I just found another discussion where the same issue was mentioned: https://huggingface.co/TheBloke/Llama-2-13B-chat-GPTQ/discussions/15 I used ExLlama to load the model, and now it works fine.

OK great

FYI, AutoGPTQ 0.3.1 has just been released, which fixes this issue. So it can now be used without problems.
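
To move to the fixed release, an upgrade along these lines should work (a hedged sketch; it assumes 0.3.1 is now the latest version pip can see, as the post above says):

pip3 uninstall -y auto-gptq
pip3 install auto-gptq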

But ExLlama is much quicker, so it remains the recommended loader when possible.
