IQ quants

#3
by koyukira - opened

Right now I'm testing MTP with the Q8 quant.

Could you please create IQ quants? Is that possible?

Thank you for the hard work.

protoLabsAI org

yep, added to files

Is there a reason I usually see IQ quants stop at IQ4? I prefer IQ quants when size is an issue, but for a 9B model I'd love to see something like IQ8 or IQ6.

protoLabsAI org

So Q6_K and Q8_0 are the IQ6/IQ8 you're picturing. They just don't carry the IQ prefix because at that bit budget the importance codebook buys you nothing, which is why you never see an IQ6 or IQ8. I could load them up, but they wouldn't be worth the disk space.

Ah! Thank you for the explanation. Does this change at all with larger-parameter models, or does it essentially always end up at the same point?

protoLabsAI org

So the mapping holds regardless of model size: 6-bit → Q6_K, 8-bit → Q8_0. Those are the 6-bit and 8-bit quants; they just don't wear an "IQ" badge because at that bit budget the IQ machinery is dead weight.
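
For a rough sense of why model size doesn't enter into it, here's a minimal sketch of the bits-per-weight arithmetic. The block layouts are my understanding of the ggml Q8_0 and Q6_K formats and should be treated as assumptions; `bits_per_weight`, `approx_size_gb`, and the 9e9 parameter count are illustrative names and numbers, not anything from llama.cpp. Real GGUF files mix tensor types and carry metadata, so actual sizes will differ somewhat.

```python
# Rough bits-per-weight (bpw) arithmetic for two ggml block formats.
# Assumption: block layouts below match ggml's Q8_0 and Q6_K definitions.

def bits_per_weight(block_bytes: int, weights_per_block: int) -> float:
    """Storage cost per weight, in bits, for a fixed-size block format."""
    return block_bytes * 8 / weights_per_block

# Q8_0: 32 weights per block = one fp16 scale (2 bytes) + 32 int8 values.
Q8_0_BPW = bits_per_weight(2 + 32, 32)

# Q6_K: 256 weights per super-block =
#   128 bytes (low 4 bits) + 64 bytes (high 2 bits)
#   + 16 int8 sub-block scales + one fp16 super-block scale (2 bytes).
Q6_K_BPW = bits_per_weight(128 + 64 + 16 + 2, 256)

def approx_size_gb(n_params: float, bpw: float) -> float:
    """Estimated tensor payload in GB if every weight used this format."""
    return n_params * bpw / 8 / 1e9

N_PARAMS = 9e9  # hypothetical parameter count for a "9B" model

for name, bpw in (("Q6_K", Q6_K_BPW), ("Q8_0", Q8_0_BPW)):
    print(f"{name}: {bpw:.4f} bpw, ~{approx_size_gb(N_PARAMS, bpw):.1f} GB")
```

The bpw figure is a property of the block format alone, so it stays the same for a 9B or a 70B model; only the total file size scales with the parameter count.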

Thank you!

artificial-citizen changed discussion status to closed
