AWQ model performs significantly worse than the GPTQ model
I had a discussion on the original model card page about issues I was having prompting this model.
https://huggingface.co/VAGOsolutions/SauerkrautLM-Mixtral-8x7B-Instruct/discussions/2
After many different tests, we came to the conclusion that the issue was caused by the AWQ quantization.
Most often, the model would continue generating more text after it had already produced the requested information.
If anyone else is having similar issues, know that it is likely the quantization and not the model itself.
I don't know whether this behavior is specific to this model or common to most AWQ quantizations; if anyone knows, I would be intrigued to find out!
@martinkozle I think this is more about the EOS token than the quantization format.
For some reason, generation doesn't stop at the EOS token. It might be a problem with the inference library or with an EOS token setting.
The text after the answer looks very much like text generated past the EOS token; it's usually completely random, unrelated content.
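For anyone who wants to check this, here is a minimal sketch using the Transformers library. The repo id is an assumption (substitute whichever AWQ quantization you actually loaded), and loading an AWQ checkpoint this way assumes the autoawq package is installed. It verifies that the tokenizer and the model's generation config agree on the EOS token, and passes the EOS id explicitly to `generate`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical AWQ repo id -- substitute the quantization you are using.
model_id = "TheBloke/SauerkrautLM-Mixtral-8x7B-Instruct-AWQ"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Requires the autoawq package for AWQ checkpoints.
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Sanity check: the tokenizer and generation config should agree on EOS.
print("tokenizer EOS:", tokenizer.eos_token, tokenizer.eos_token_id)
print("generation config EOS:", model.generation_config.eos_token_id)

prompt = "[INST] What is the capital of France? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Pass the EOS id explicitly so generation stops when that token is sampled.
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    eos_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If the two printed EOS ids disagree, or the rambling stops once the EOS id is passed explicitly, that would point to a configuration problem rather than the quantization itself.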