Help. Getting Gibberish Responses?

#1
by deleted - opened
deleted

I'm sorry for asking, but I'm a newbie. Every exllama 2 model I try gives me gibberish responses, and I'm not sure what could be causing it. I tried both exllama 2 and exllama 2_HF, but neither works for me.

Kooten changed discussion status to closed
Kooten changed discussion status to open
Owner

I assume you are using ooba?
Try updating it in case you have an older version.
You could also try changing the preset in the Parameters tab, or in SillyTavern or whatever frontend you use. Maybe the temperature or something is set way too high?

deleted

Sorry, yeah. I'm using Oobabooga. It doesn't seem to be any of the things you mentioned. I tried running it through SillyTavern and I don't even get a response. Maybe it's because I'm using an AMD card with Linux.

Owner

That's likely it. I'm not sure how well exllama supports AMD cards; Nvidia is favoured pretty heavily.
KoboldCpp just released support for Vulkan, so that might work better?

deleted

Oh, interesting. I'll take a look at it. Thank you. :)

I don't know why, but this exl2 gives me much worse responses than GGUF Q5_K_M and doesn't follow guidelines. Instead of role-playing, it turns out to be story-telling... 🤔

Owner

I loaded it up and it is coherent for me using an up-to-date ooba. I do not get any runaway generation either.
If you are getting actual gibberish, I would make sure your backend (ooba, tabby, or whatever you are using) is up to date. You could also check your sampler settings; Shortwave usually works fine for me.
Regarding the context, are you trying to run 32k context? If so, that is probably the issue, since I do not think exllama handles sliding window attention. Try 8k instead.
