Only Q8_0 is working. I think maybe the MoE merges have this flaw. Let me know if anyone has any insights regarding this.
I'm personally using the Q5_K_M without any issues, running inference through llama.cpp (TGWUI and Jan).
Which ones did you try and what inference engine did you use?
I just found that Q6_K is broken! It only outputs boxes!
e.g. "33A
//EMI
2I,IMA8E#?E?E,(Q IQ.22E3A"
But in the current state, at least these 3 are tested and working fine (unluckily, those were the only ones I tested and kept for myself, so I never noticed the issue):
- Q4_K_M
- Q5_K_S
- Q5_K_M
Maybe I mixed up two scripts or ran out of storage when converting this specific one. If needed, I'd be happy to requantize, as I kept the F16 model on my drive.
But it would be really helpful if you could let me know which broken ones you tried.
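For reference, here's a minimal sketch of how I'd re-run the conversion from the kept F16 GGUF, assuming a recent llama.cpp build where the quantization binary is named `llama-quantize` (the file paths here are hypothetical, adjust to your own layout):

```python
import subprocess
from pathlib import Path

# Hypothetical paths; adjust to your own setup.
F16_MODEL = Path("models/model-f16.gguf")
OUT_DIR = Path("models/quants")
QUANT_TYPE = "Q6_K"  # the quant that came out broken

OUT_DIR.mkdir(parents=True, exist_ok=True)
out_file = OUT_DIR / f"model-{QUANT_TYPE}.gguf"

# llama.cpp ships a quantization tool; recent builds name the
# binary `llama-quantize` (older builds called it `quantize`).
subprocess.run(
    ["./llama-quantize", str(F16_MODEL), str(out_file), QUANT_TYPE],
    check=True,
)
print(f"wrote {out_file}")
```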
OK, I retested every quant, so really no luck if you only tried the Q6_K before the Q8_0, as it was the only broken one!
I'm gonna requantize it and reupload if everything goes right this time. I'll let you know ;)
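In case it helps with retesting, this is the kind of rough smoke-test loop I used, assuming llama.cpp's `llama-cli` binary is on the PATH and the same hypothetical `models/quants/` layout as above; the garbled-output check is only a crude heuristic, not a proper eval:

```python
import subprocess
from pathlib import Path

PROMPT = "Write one short sentence about the sea."
QUANTS = ["Q4_K_M", "Q5_K_S", "Q5_K_M", "Q6_K", "Q8_0"]

def looks_garbled(text: str) -> bool:
    # Crude heuristic: the broken quant emitted mostly symbols and
    # digits, so flag output with very few alphabetic characters.
    alpha = sum(c.isalpha() for c in text)
    return bool(text) and alpha / len(text) < 0.4

for q in QUANTS:
    model = Path(f"models/quants/model-{q}.gguf")  # hypothetical layout
    result = subprocess.run(
        ["llama-cli", "-m", str(model), "-p", PROMPT, "-n", "32"],
        capture_output=True,
        text=True,
    )
    verdict = "GARBLED?" if looks_garbled(result.stdout) else "looks ok"
    print(f"{q}: {verdict}")
```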
Q6_K requantized and working as intended this time!
So far, the only one I didn't test myself is the Q8_0. But with your feedback on that one, we've now covered them all :)
Thanks for pointing out the issue!
I'm closing this.