Add importance matrixes to the existing and future model cards, as well as general imat info

#9
by Erilaz - opened

That's not a model request, it's a suggestion.

Thanks to ikawrakow's efforts, more novel imat quant formats are becoming available, and we all know that ggerganov's PRs sometimes break legacy model files for good. With your effort rapidly catching up to TheBloke's contribution back in the day, I can easily imagine how much time it would take to requant all the models to new standards. The community could therefore greatly benefit from the imatrixes being published with your cards, should such changes arrive: the likelihood of an imat becoming obsolete is low, while the hardware requirements to produce one are high. Since imatrixes are small compared with the models themselves, I suspect you store them, and if that's the case, it should be easy enough to add them to the existing model cards. If you have no objections, of course ^_^

Also, it appears that some multilingual models or language-specific fine-tunes degrade in output quality when the calibration data doesn't match the language of the intended use case, while a matching imat greatly outperforms both static quants and quants made with a mismatched imat, especially at low precision. I wouldn't ask for the colossal effort of making imats ad infinitum, but I suggest adding a basic description of the calibration data to the model card info to address this. I assume you have some default calibration data for all models now, but that's not a given, and having some info on the matter in the cards would be nice. I can also imagine some good multilingual models in the future, and if some model performs very well in some language, people could request or even contribute their imats or calibration datasets, turning your great collection of quantized models into a staple of the community.

All my imatrix quants have an imatrix.dat in the repo; it's usually the first upload. I don't want to add them to the model card, as the files are usually easy to find if you are looking for them, and the model cards are confusing enough.
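
For anyone who wants to fetch the file from a script rather than the web UI, here is a minimal sketch. It assumes only that the file is named `imatrix.dat` (as stated above) and sits at the top level of the repo's `main` branch. The repo id in the usage note is a made-up placeholder, and the helper names `imatrix_url` and `download_imatrix` are hypothetical and not part of any library.

```python
from urllib.parse import quote
from urllib.request import urlretrieve

HF_BASE = "https://huggingface.co"


def imatrix_url(repo_id: str, filename: str = "imatrix.dat", revision: str = "main") -> str:
    """Build the direct-download ("resolve") URL for a file in a Hugging Face model repo.

    The default filename and revision are assumptions; adjust them if a repo
    stores its imatrix under a different name or branch.
    """
    # quote() keeps the "/" between owner and repo name intact.
    return f"{HF_BASE}/{quote(repo_id)}/resolve/{quote(revision, safe='')}/{quote(filename)}"


def download_imatrix(repo_id: str, dest: str = "imatrix.dat") -> str:
    """Download the imatrix file to `dest` and return the local path (needs network access)."""
    path, _headers = urlretrieve(imatrix_url(repo_id), dest)
    return path
```

Calling `download_imatrix("someuser/Some-Model-i1-GGUF")` with a real repo id in place of the placeholder would save the file locally, where it can be passed to llama.cpp's quantize tool via its `--imatrix` option when requantizing.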

I do indeed have a default way of making the weight matrices, and its description is a bit scattered over my early models. It is time for a FAQ, but currently I am still busy with just providing the models. When I don't use the "default", I usually point this out in the model card.

mradermacher changed discussion status to closed
