any plan to release quantization that works with llama.cpp

#3
by ziyadalkhonein - opened

Any plan to release a quantization that works with llama.cpp? You know, not a lot of people have a V100 or A100.

You can run it in oobabooga in 4-bit, which uses less VRAM.
