Can't say I am thrilled to download and convert these big models just for one quant, but you definitely earned it :) They should be there already.
Usually Q4_1 has lower perplexity than Q4_0 across the board.
Well, we've been telling you: Q4_1 is well known to be very unstable. That was one of the reasons it was abandoned: it is often larger and worse than Q4_0. The reason I added it was that you convinced me it is useful in certain situations (speed with Metal). This is just another data point that the problems were not fixed in recent versions.
The Q4_1 quant was done with the same imatrix, but using the current version of llama.cpp.
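For anyone who wants to reproduce that step, here is a minimal sketch of the invocation, assuming the current llama-quantize binary name and placeholder file names (the real paths differ):

```python
# Rough sketch of the re-quantization step described above, driving llama.cpp's
# llama-quantize binary from Python. All file names are placeholders, not the
# actual paths used for this model.
import subprocess

subprocess.run(
    [
        "llama-quantize",            # quantize tool from a current llama.cpp build
        "--imatrix", "imatrix.dat",  # reuse the previously computed importance matrix
        "model-f16.gguf",            # unquantized source GGUF
        "model-Q4_1.gguf",           # output file
        "Q4_1",                      # target quantization type
    ],
    check=True,                      # fail loudly if quantization errors out
)
```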
We will have to accept that Q4_1 quants can turn out worse than Q4_0 quants. For static quants, my measurements so far even indicate that this is usually the case. While for weighted/imatrix quants Q4_1 is usually much better than Q4_0, you cannot rely on imatrix training working that well for every model. There are some models/architectures that see less improvement from weighted/imatrix quants, in which case Q4_0 might still beat Q4_1. However, I would expect such outliers to be quite rare, and I would expect them to occur mainly for non-English models, because our imatrix dataset is English-focused.
In this specific case I wouldn't count too much on perplexity measurements. They are one of the worst measurements llama.cpp gives you, especially when comparing quants of almost the same quality. Instead, use KL-divergence and token-probability measurements and see whether they lead you to the same conclusion.
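To make that concrete, here is a rough sketch of what those two measurements compare, assuming you already have per-token probability distributions from the reference (unquantized) model and from the quant; the array names, shapes, and toy data are made up purely for illustration:

```python
# Minimal sketch of the suggested metrics, computed from per-token probability
# distributions over the vocabulary. Two quants can have nearly identical
# perplexity while KL-divergence against the reference still separates them.
import numpy as np

def kl_divergence(p_ref: np.ndarray, p_quant: np.ndarray, eps: float = 1e-10) -> float:
    """Mean KL(p_ref || p_quant) over all evaluated token positions."""
    p = np.clip(p_ref, eps, 1.0)
    q = np.clip(p_quant, eps, 1.0)
    return float(np.mean(np.sum(p * np.log(p / q), axis=-1)))

def top_token_agreement(p_ref: np.ndarray, p_quant: np.ndarray) -> float:
    """Fraction of positions where the quant picks the same top token as the reference."""
    return float(np.mean(p_ref.argmax(axis=-1) == p_quant.argmax(axis=-1)))

# Toy data: 128 token positions over a 32000-token vocabulary, with the "quant"
# being a slightly noisy copy of the reference distribution.
rng = np.random.default_rng(0)
logits = rng.normal(size=(128, 32000))
p_ref = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
noisy = logits + rng.normal(scale=0.05, size=logits.shape)
p_q = np.exp(noisy) / np.exp(noisy).sum(axis=-1, keepdims=True)
print(kl_divergence(p_ref, p_q), top_token_agreement(p_ref, p_q))
```

You don't have to script this yourself: if I remember the flags right, llama.cpp's perplexity tool can report these numbers directly once you save the reference model's logits (--kl-divergence-base) and then evaluate the quant against that file with --kl-divergence.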