Seems to generate gibberish for me

#3 opened by anon8231489123

Tried running on oobabooga with the following command:
python server.py --model llama-storytelling-13b-4bit-32g --wbits 4 --listen --chat --groupsize 32 --settings settings-chatbot.json --model_type llama
Output was this after prompting slightly:
king us throughigskotoS,kc piecesLI Nk obkUS+AcLN Englishitu IT UnW piecestryoAc"]A\Hwo top± otA almostoma click antkie figuresbose shemen N orilakh = terboseThong+ ONUscomm rad Usku live"] nationsuboseposaUs­ritetrk ONais for agthey MoreTwo thr Alicevik &itthusakhs Wulog minimum amountsOnebisAnt click equ & nearly it now UnderAnt nom< app objectskils Achao absolutely +Longu organizationship San translatedong+aireaisoma standk almost intenosRerov kigsakh # Stand Alice almostsk oruSdr+ pieces &old Englishus ALLPEgs plussk millionIais lsilkth objects PDF+ logbose envi within flash Overk "wkomd lit Clilla Sees Strak schiskoRe+ Billybosakhт SUookbosemenC tot soughttrill + treat positionsSu Kasus hard Stand Twologatel loadsigsOhousblog Washington +at palesucorr occup AntiskoakhgoOus+ Ant Over Alice Vil Englishrieaiscboseyoflashboseaks magn nigs EnglishItboseTigsthomo for CharlieskbbeusiennstandyUsclbosesls + ls Chinaplus MLaireominbosewaiskmons Rat youigsIt Mongo coldcbosek now­ThComm+ then- lean theoretical Ast Inigs It Smithking Checkomsils ONbbe

Settings? Seems like a tokenizer problem.

Which GPTQ commit did you use?

I think it was 4b7c8bd, but I'm using the latest version on the cuda branch and it's working on my end.
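For anyone comparing versions, you can check which GPTQ-for-LLaMa commit and branch you are actually running with plain git. The path below is illustrative, assuming it was cloned into text-generation-webui/repositories:

cd text-generation-webui/repositories/GPTQ-for-LLaMa
git log -1 --oneline
git branch --show-current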

Yeah, if a GPTQ model is created with the latest GPTQ-for-LLaMa code, you then have to use that same latest GPTQ-for-LLaMa code in text-generation-webui/repositories, otherwise you get gibberish.

I include instructions on how to update text-generation-webui with the latest GPTQ code in my GPTQ models. Example: https://huggingface.co/TheBloke/vicuna-7B-GPTQ-4bit-128g
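Roughly, the update amounts to replacing the GPTQ-for-LLaMa checkout under repositories and rebuilding the CUDA kernel. A sketch, assuming qwopqwop200's cuda branch and the standard text-generation-webui layout (see the model card linked above for the exact, tested steps):

cd text-generation-webui/repositories
rm -rf GPTQ-for-LLaMa
git clone -b cuda https://github.com/qwopqwop200/GPTQ-for-LLaMa.git
cd GPTQ-for-LLaMa
python setup_cuda.py install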

Alternatively, if the GPTQ is done with the GPTQ-for-LLaMa fork provided by oobabooga (https://github.com/oobabooga/GPTQ-for-LLaMa), then it will work immediately in text-generation-webui. However, I found that with this older code you cannot use --act-order in the GPTQ settings, as that again produces gibberish. And without --act-order the inference results may be slightly lower quality (I don't know by how much).
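For context, this is roughly what the quantization command looks like on the newer GPTQ-for-LLaMa code; the model path and output filename are just placeholders, and on oobabooga's older fork you would simply omit --act-order:

python llama.py /path/to/llama-13b-hf c4 --wbits 4 --groupsize 128 --act-order --save_safetensors llama-13b-4bit-128g.safetensors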

On my Koala repos, e.g. https://huggingface.co/TheBloke/koala-13B-GPTQ-4bit-128g, I also provided an older GPTQ made with oobabooga's fork. But it takes a lot of time to produce three GPTQ files for every repo!
