Could we combine AWQ and importance matrix calculation to further improve perplexity?

#4
by shing3232

Same as the question in the title: could we do that, or does it not matter at all?
AutoAWQ can calculate AWQ scales for llama.cpp quantization:
https://github.com/casper-hansen/AutoAWQ/pull/285
Thanks

What does AutoAWQ do? I could go and look around in the linked repo, but it would be much easier if someone explained their approach.

If I understand their paper correctly, a similar scale search is already part of what I do for these quantized models, so I'm not sure combining the two will help.
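
For context, here is a minimal NumPy sketch of the grid search over per-channel scales described in the AWQ paper. This is a toy rendition, not AutoAWQ's actual code; the crude 4-bit round-to-nearest quantizer and all function names are illustrative assumptions:

```python
import numpy as np

def fake_quant(w, bits=4):
    """Crude symmetric per-row round-to-nearest quantizer (illustrative only)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax + 1e-12
    return np.round(w / scale).clip(-qmax, qmax) * scale

def awq_scale_search(w, x, n_grid=20):
    """Toy AWQ-style scale search.

    w: weights of shape (out_features, in_features)
    x: calibration activations of shape (n_samples, in_features)
    Tries scales s = mean(|x|)**alpha over a grid of alpha and keeps the one
    that minimizes the layer-output error after quantizing the scaled weights.
    """
    act_mean = np.abs(x).mean(axis=0) + 1e-8      # per-input-channel magnitude
    ref = x @ w.T                                 # full-precision reference output
    best_err, best_s = np.inf, np.ones_like(act_mean)
    for alpha in np.linspace(0.0, 1.0, n_grid):
        s = act_mean ** alpha
        s /= np.sqrt(s.min() * s.max())           # keep scales centered around 1
        w_q = fake_quant(w * s) / s               # scale, quantize, undo the scale
        err = ((x @ w_q.T - ref) ** 2).mean()
        if err < best_err:
            best_err, best_s = err, s
    return best_s
```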

But I have now contributed the quantization approach used for these models to llama.cpp.
My guess is that it would be easier for the contributors of https://github.com/casper-hansen/AutoAWQ/ to try this than for me to get up to speed with their repo.
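
For comparison, here is a toy sketch of the idea behind an importance matrix: collect per-channel activation statistics from calibration data, then weight the quantization error by them. This assumes a diagonal mean-squared-activation importance, which is a simplification; the names are mine, not llama.cpp's actual implementation:

```python
import numpy as np

def importance(x):
    """Diagonal importance estimate: mean squared activation per input channel.

    x: calibration activations of shape (n_samples, in_features).
    """
    return (x ** 2).mean(axis=0)

def importance_weighted_quant(w, imp, bits=4):
    """Pick each row's quantization scale to minimize the importance-weighted
    squared error sum_j imp[j] * (w[i, j] - q[i, j])**2 over a small grid."""
    qmax = 2 ** (bits - 1) - 1
    out = np.empty_like(w)
    for i, row in enumerate(w):
        base = np.abs(row).max() / qmax + 1e-12
        best_err, best_q = np.inf, row
        for f in np.linspace(0.8, 1.2, 21):       # search around the naive scale
            scale = base * f
            q = np.round(row / scale).clip(-qmax, qmax) * scale
            err = (imp * (row - q) ** 2).sum()
            if err < best_err:
                best_err, best_q = err, q
        out[i] = best_q
    return out
```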
