What quantization is this? Can it be used with vLLM?
Opened by rafa9:
Hi! Is this quantized with GPTQ, AWQ, or SqueezeLLM? I'm hoping to run it with the vLLM templates on RunPod.
Thanks!
I generated an AWQ quant around the same time as the exl2 quants:
https://huggingface.co/LoneStriker/Air-Striker-Mixtral-8x7B-Instruct-ZLoss-AWQ
I did not generate a GPTQ quant, though; that needed more resources than I had at the time.
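
In case it helps with the vLLM side, here's a minimal sketch of loading that AWQ repo with vLLM's offline API. The `quantization="awq"` flag is real vLLM usage, but the sampling settings and the `[INST] ... [/INST]` prompt format are assumptions on my part; I haven't verified them against this exact checkpoint:

```python
# Minimal sketch: loading the AWQ quant with vLLM's offline API.
# Assumes vLLM is installed with AWQ kernel support and a GPU with
# enough VRAM for Mixtral 8x7B at 4-bit.
from vllm import LLM, SamplingParams

llm = LLM(
    model="LoneStriker/Air-Striker-Mixtral-8x7B-Instruct-ZLoss-AWQ",
    quantization="awq",  # use vLLM's AWQ kernels
    dtype="half",        # AWQ kernels run in fp16
)

# Prompt format assumed to follow Mixtral-Instruct conventions; check
# the model card if outputs look off.
params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["[INST] Hello, who are you? [/INST]"], params)
print(outputs[0].outputs[0].text)
```

For the RunPod templates, which wrap vLLM's OpenAI-compatible server, passing the same model name plus `--quantization awq` to the server entrypoint should be the equivalent setup.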