any possibility of getting a 7b version of this somehow?

#2 · opened by clown134

I'm not too sure if it's possible, but I'd love to try these MLewd models out; my 1060 only has 6 GB of VRAM and has a hard time handling such a large model.
If possible, I'd love a 7B version of this, or of the chat version of this model.
Please and thank you!
Currently I'm using MistRP-AirOrca-7B.q5_k_m.gguf and love it.

Well, since you're already familiar with GGUF, why not give Undi95/MLewd-v2.4-13B-GGUF a try? I'm pretty sure it performs better than any 7B version of this model.
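(For anyone landing here with the same VRAM limit: here's a minimal sketch of how you might run a 13B GGUF quant on a 6 GB card by offloading only some layers to the GPU, assuming llama-cpp-python; the local file name and the n_gpu_layers value are placeholders to tune, not confirmed settings for this model.)

```python
# Minimal sketch, assuming llama-cpp-python is installed and a Q4_K_M
# quant has been downloaded; file name and layer split are placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="mlewd-v2.4-13b.Q4_K_M.gguf",  # hypothetical local file
    n_ctx=2048,        # smaller context -> smaller KV cache in memory
    n_gpu_layers=28,   # offload part of the 13B; the rest stays in system RAM
)

out = llm("### Instruction:\nWrite one sentence.\n\n### Response:\n",
          max_tokens=64)
print(out["choices"][0]["text"])
```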

Hello, I completely forgot about this. I can't shrink the model down to 7B, because all the resources used were 13B. Sorry!

Try Undi95's Toppy. It's basically MLewd for 7B (decent prose, pretty dirty). You could potentially use a 2k-context GGUF of a 13B too, but Mistral 7B models fit sweetly into VRAM because of their sliding-window attention, which cuts the RAM the context uses a lot (rough numbers in the sketch below).
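(To put back-of-the-envelope numbers on the sliding-window point: Mistral 7B's published config is 32 layers, 8 KV heads, and head dim 128, with a 4096-token attention window; assuming an fp16 cache, the KV memory works out as below.)

```python
# Rough KV-cache size for Mistral 7B (fp16 cache assumed).
layers, kv_heads, head_dim, bytes_fp16 = 32, 8, 128, 2

def kv_cache_mb(tokens: int) -> float:
    # 2x for keys and values; one entry per layer, head, dim, and token
    return 2 * layers * kv_heads * head_dim * tokens * bytes_fp16 / 1024**2

print(kv_cache_mb(4096))   # ~512 MB: the sliding window caps the cache here
print(kv_cache_mb(32768))  # ~4 GB: what an uncapped 32k-token cache would cost
```

That cap is part of why a quantized Mistral 7B plus its cache slides into 6 GB, while a 13B (whose Q4 weights alone run roughly 8 GB) has to spill into system RAM and slow down.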

Thanks, Toppy is pretty good. It JUST fits into my VRAM and responds extremely fast, certainly faster than I can type. I have 6 GB of VRAM, so 13B models are extremely slow, but I could see them being useful for something where I needed coherency.
