vLLM 4Bit?

#1
by winsonsou - opened

Hello! How do we run this in vLLM with 4-bit quantization? Would you happen to have the command to run it? Btw, thanks for the model! <3

To my understanding, this is quantized using bitsandbytes; for vLLM we need one of these formats instead: AWQ or GPTQ.
I'm looking forward to a GPTQ version too!
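
For reference, here is a minimal sketch of how an AWQ- or GPTQ-quantized checkpoint would typically be loaded with vLLM's Python API once one is available. The model ID below is a placeholder, not an actual released quant.

```python
# Minimal sketch: loading an AWQ-quantized checkpoint with vLLM.
# The repo ID is hypothetical; substitute the real AWQ/GPTQ repo once it exists.
from vllm import LLM, SamplingParams

llm = LLM(
    model="someorg/model-AWQ",  # placeholder AWQ checkpoint
    quantization="awq",         # use "gptq" for a GPTQ checkpoint
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Hello, how are you?"], params)
print(outputs[0].outputs[0].text)
```

The equivalent OpenAI-compatible server would be started along the lines of `python -m vllm.entrypoints.openai.api_server --model <repo> --quantization awq`.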

Unofficial Mistral Community org

I will work on it later today :)
