Is the Q8_0 quant also imatrix'd? Why?

#1
by igzbar - opened

What was the basis of the decision to use imatrix vs. regular quantization for Q8_0? Doesn't imatrix reduce performance?

It shouldn't reduce performance (unless you have a source on that), but it also shouldn't affect it much, if at all: at Q8 there's no need to compress some portions of the weights further than others.
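
To give a rough sense of why there is so little left to optimize at Q8, here is a minimal sketch of Q8_0-style block quantization in plain Python. It assumes the usual layout of 32 weights per block sharing one scale, with each weight stored as a signed 8-bit integer. The function names are made up for illustration and are not llama.cpp's API, the weights are random stand-ins, and the scale is kept in full precision rather than fp16 to keep the example simple.

```python
import random

BLOCK = 32  # assumed Q8_0 block size: 32 weights share one scale


def quantize_q8_0_block(xs):
    """Quantize one block to (scale, int8 values). Illustrative helper, not a real API."""
    amax = max(abs(x) for x in xs)
    d = amax / 127.0  # one step of the 8-bit grid for this block
    if d == 0.0:
        return 0.0, [0] * len(xs)
    qs = [max(-127, min(127, round(x / d))) for x in xs]
    return d, qs


def dequantize_block(d, qs):
    """Reconstruct approximate weights from the scale and the int8 values."""
    return [d * q for q in qs]


random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(BLOCK)]  # stand-in weights

d, qs = quantize_q8_0_block(weights)
restored = dequantize_block(d, qs)
max_err = max(abs(a - b) for a, b in zip(weights, restored))

# Worst-case rounding error is half a step, i.e. amax / 254.
print(f"step = {d:.6g}, max abs error = {max_err:.6g}")
```

Every weight lands within half a step of its original value, which is about 0.4% of the largest magnitude in its block. An importance matrix can only influence how that small rounding error is distributed, so there is little for it to gain at this bit width.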
