4-bit quantization

#2
by ibalampanis - opened

Hello!

How did you manage to quantize it to Q4_K_M? llama.cpp's convert script offers only q8_0, f16, and f32, right?
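(For context, my understanding is that the convert script only produces those base types, and the lower-bit K-quants come from a separate quantize tool afterwards. A rough sketch of that two-step flow, assuming a standard llama.cpp checkout and a hypothetical local model directory `./model`:)

```shell
# Step 1: convert the original model to a GGUF file in one of the
# types the converter supports (q8_0, f16, f32) -- here f16.
python convert_hf_to_gguf.py ./model --outtype f16 --outfile model-f16.gguf

# Step 2: re-quantize the f16 GGUF down to Q4_K_M with the separate
# quantize binary built from the llama.cpp repo.
./llama-quantize model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```

Is that roughly what you did, or did you use a different path?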

Thanks.
