vLLM 4Bit?
#1, opened by winsonsou
Hello! How do we run this in vLLM with 4-bit quantization? Do you happen to have the command to run it? By the way, thanks for the model! <3
To my understanding, this model is quantized with bitsandbytes, while for vLLM we need a checkpoint in one of these formats: AWQ or GPTQ.
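For reference, once an AWQ or GPTQ checkpoint exists, the launch command would probably look something like the sketch below. This is only an assumption about how it could be served: the model ID `your-org/your-model-AWQ` is a hypothetical placeholder, `build_vllm_command` is a helper made up for illustration, and the allowed set only mirrors the two formats named above, not vLLM's full list of quantization methods. The `vllm serve` entry point and its `--quantization` flag are part of vLLM's CLI.

```python
import shlex

# Only the two formats mentioned in this thread; not an exhaustive list.
THREAD_FORMATS = ("awq", "gptq")


def build_vllm_command(model_id: str, quantization: str) -> list[str]:
    """Build the argument list for serving a pre-quantized checkpoint with vLLM."""
    if quantization not in THREAD_FORMATS:
        raise ValueError(
            f"expected one of {THREAD_FORMATS}, got {quantization!r}"
        )
    # Equivalent shell form: vllm serve <model_id> --quantization <method>
    return ["vllm", "serve", model_id, "--quantization", quantization]


if __name__ == "__main__":
    # Hypothetical repo name; replace with the real AWQ/GPTQ upload once it exists.
    cmd = build_vllm_command("your-org/your-model-AWQ", "awq")
    print(shlex.join(cmd))
```

The printed line can be pasted into a shell on a machine with vLLM installed; swap in `"gptq"` and the matching repo name for a GPTQ upload.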
I look forward to GPTQ too!
I will work on it later today :)