Could you provide 4-bit quantized model weights?

#6
by eeyrw - opened

The 32-bit weights are too big to download. If you could provide a 4-bit quantized version, like https://huggingface.co/baichuan-inc/Baichuan2-7B-Chat-4bits, it would make downloading the model and loading it onto the GPU much more convenient. On-the-fly 4-bit loading with bitsandbytes looks roughly like the sketch below, but it still downloads the full-size files first, which is exactly the problem.
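Here is a minimal sketch, assuming a transformers-compatible checkpoint (the model id is a placeholder; this repo may need its own loader rather than the Auto classes):

```python
# Sketch: quantize to 4 bits at load time with bitsandbytes.
# Note: this only saves GPU memory, not download bandwidth --
# the full-precision weights are still fetched first.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "liuhaotian/llava-v1.5-13b"  # placeholder; may not load via Auto classes

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantize weights to 4 bits on load
    bnb_4bit_compute_dtype=torch.float16,  # run matmuls in fp16
    bnb_4bit_quant_type="nf4",             # NormalFloat4, the usual QLoRA default
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```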

I suppose you could try PsiPi/liuhaotian_llava-v1.5-13b-GGUF until they do.
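Something like this should get that GGUF running with llama-cpp-python. A rough sketch: the exact .gguf filename is a guess, so check the repo's file list, and image input would additionally need the mmproj file plus a LLaVA chat handler, which I've left out here.

```python
# Sketch: download a 4-bit GGUF quant and run text-only generation locally.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="PsiPi/liuhaotian_llava-v1.5-13b-GGUF",
    filename="llava-v1.5-13b-Q4_K_M.gguf",  # hypothetical filename; verify in the repo
)

llm = Llama(model_path=model_path, n_gpu_layers=-1)  # offload all layers to GPU
out = llm("Describe quantization in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```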
