Q6_K version is broken

#19
by tankstarwar - opened

The Q6_K version seems broken: I get rubbish output from this model, not even readable text. The Q8_0 version works just fine.

No problem with this on my end with mixtral-8x7b-instruct-v0.1.Q6_K.gguf.
I use oobabooga/text-generation-webui on an RTX 3060 12GB and get roughly 2 tokens/sec.

I use "instruct" in chat. it is important.

Hmm, any chance the model could have gotten corrupted during download? I did experience some network interruptions, but usually with the HF CLI I can resume downloading without any issue.
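One way I could rule that out: compare the local file's SHA256 with the checksum listed on the file's page here and re-download if it differs. A rough sketch with huggingface_hub (the repo ID and filename below are my assumptions, adjust to what you actually grabbed):

```python
import hashlib
from huggingface_hub import hf_hub_download

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so a tens-of-GB GGUF never has to sit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

print(sha256_of("mixtral-8x7b-instruct-v0.1.Q6_K.gguf"))  # compare with the hash shown on the model page

# If the hashes differ, just pull it again; hf_hub_download picks up partial downloads.
path = hf_hub_download(
    repo_id="TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF",   # assumed repo ID
    filename="mixtral-8x7b-instruct-v0.1.Q6_K.gguf",
)
print(path)
```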

I'm pretty sure it's not a prompting issue; the llama.cpp command line can load the model, but the output is not human-readable at all.
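If it helps anyone reproduce this outside the webui, here is roughly the kind of sanity check I mean, written with llama-cpp-python rather than the exact CLI call I used (model path and GPU layer count are placeholders):

```python
from llama_cpp import Llama

# Load the Q6_K file directly and run one short completion.
llm = Llama(
    model_path="mixtral-8x7b-instruct-v0.1.Q6_K.gguf",  # assumed local path
    n_ctx=2048,
    n_gpu_layers=10,  # whatever fits in your VRAM; 0 = CPU only
)

out = llm("[INST] Name three planets. [/INST]", max_tokens=64)
print(out["choices"][0]["text"])  # readable text means the file itself is probably fine
```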

Anyway, thanks for the repo!

How much system RAM do you have? I have a 3060 12GB too, plus 16 GB RAM, and last I checked even Q4_K_M wouldn't run.
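Rough back-of-envelope, with approximate bits-per-weight for the common quants (these are estimates, not exact file sizes):

```python
# Mixtral 8x7B has roughly 46.7B parameters in total (all experts counted).
PARAMS = 46.7e9
BPW = {"Q4_K_M": 4.5, "Q6_K": 6.6, "Q8_0": 8.5}  # rough effective bits per weight

budget_gb = 12 + 16  # 12 GB VRAM on the 3060 + 16 GB system RAM

for name, bpw in BPW.items():
    size_gb = PARAMS * bpw / 8 / 1e9
    print(f"{name}: ~{size_gb:.0f} GB weights vs {budget_gb} GB total memory")
# Q4_K_M comes out around 26 GB, which leaves almost nothing for the OS and context;
# Q6_K is around 39 GB, clearly more than 12 GB VRAM + 16 GB RAM.
```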

I can't comment much on speed, as it was slow, but Q6_K works for me using ooba (64 GB Xeon box + 20 GB GPU).

In oobabooga/text-generation-webui there are two options to load this mixtral-8x7b-instruct-v0.1.Q6_K.gguf model:
- Model loader: llama.cpp, which is slow (2-3 tokens/sec) but seems by far the best local LLM setup I can run on my hardware (see above).
- Model loader: ctransformers, which is fast (17-29 tokens/sec) but doesn't seem as clever. For example, the snake.py it generated always failed, and I tried many options...

llama.cpp is what I normally use for GGUF. Slow, but reliable.

@robert1968 Hmm, that's interesting. I don't think ctransformers supports Mixtral, and ctransformers is usually noticeably slower than llama.cpp since it uses much older llama.cpp/ggml versions.
So I don't know; what if you're actually running it as plain Mistral in ctransformers?
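If you want to check, ctransformers makes you pick (or it guesses) a model_type when loading; as far as I know there is no "mixtral" type, so the closest you can do is force "mistral", something like this (path and gpu_layers are placeholders, and it may simply fail to load):

```python
from ctransformers import AutoModelForCausalLM

# Force the model_type explicitly; if only plain "mistral" is available,
# Mixtral's MoE routing is not actually supported by this backend.
llm = AutoModelForCausalLM.from_pretrained(
    "mixtral-8x7b-instruct-v0.1.Q6_K.gguf",  # assumed local path
    model_type="mistral",
    gpu_layers=10,  # adjust to your VRAM
)
print(llm("[INST] Name three planets. [/INST]", max_new_tokens=64))
```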
