IQ quants

by koyukira - opened 5 days ago

Discussion

koyukira

5 days ago

Right now I'm testing MTP with the Q8 quant.

Could you please create IQ quants? Is that possible?

Thank you for the hard work.

artificial-citizen

protoLabsAI org 5 days ago

Yep, added to the files.

bleaki

3 days ago

Is there a reason I usually see IQ quants stop at IQ4? I prefer IQ quants when size is an issue, but for a 9B model I'd love to see something like IQ6 or IQ8.

artificial-citizen

protoLabsAI org 3 days ago

Q6_K and Q8_0 are effectively the IQ6/IQ8 you're picturing. They just don't carry the IQ prefix because at that bit budget the importance codebook buys you nothing, which is why you never see IQ quants at those sizes. I could upload them, but they wouldn't be worth the disk space.

bleaki

3 days ago

Ah! Thank you for the explanation. Does this change at all with larger-parameter models, or does it essentially always end up at the same point?

artificial-citizen

protoLabsAI org 3 days ago

The mapping holds regardless of model size: 6-bit → Q6_K, 8-bit → Q8_0. Those are the 6-bit and 8-bit quants; they just don't wear an "IQ" badge because at that bit budget the IQ machinery is dead weight.
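
For a rough sense of what those two quants cost on disk, here is a minimal back-of-the-envelope sketch. The block layouts are assumptions taken from my reading of the ggml block definitions (Q8_0: 32 weights in 34 bytes; Q6_K: 256 weights in 210 bytes), the 9B parameter count is just the model size mentioned above, and the helper names are hypothetical. Real GGUF files mix tensor types and carry metadata, so treat the sizes as estimates only.

```python
# Assumed block layouts (weights per block, bytes per block); not read from any file.
# Q8_0: 32 int8 values + one fp16 scale              = 34 bytes per 32 weights
# Q6_K: 128 + 64 bytes of quants, 16 scales, 1 fp16  = 210 bytes per 256 weights
BLOCKS = {
    "Q8_0": (32, 34),
    "Q6_K": (256, 210),
}


def bits_per_weight(name: str) -> float:
    """Effective bits per weight, including the per-block scale overhead."""
    weights, nbytes = BLOCKS[name]
    return nbytes * 8 / weights


def approx_size_gb(name: str, n_params: float) -> float:
    """Rough size in GB if every tensor used this one quant type."""
    return n_params * bits_per_weight(name) / 8 / 1e9


if __name__ == "__main__":
    n_params = 9e9  # assumption: the "9B" model discussed in this thread
    for name in BLOCKS:
        bpw = bits_per_weight(name)
        size = approx_size_gb(name, n_params)
        print(f"{name}: {bpw} bits/weight, roughly {size:.1f} GB")
```

The bits-per-weight figure depends only on the block layout, not on the parameter count, which is consistent with the mapping being the same for small and large models.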

bleaki

3 days ago

Thank you!

artificial-citizen changed discussion status to closed 3 days ago
