Add quants for Q5

#2 opened by dzupin

Hi,
Your quants are the best of what is currently available on Hugging Face for deepseek-coder-33b.
I just compared your ggml-deepseek-coder-33b-instruct-q4_k_m.gguf with the deepseek-coder-33b-instruct.Q4_K_M made by TheBloke on a set of my Python tests.
Your Q4_K_M model passed with flying colors, while the other Q4_K_M failed almost half of my tests. I expected similar performance, but that is not the case.
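For reference, a comparison like this can be scripted with llama-cpp-python; here is a minimal sketch of the idea. The file names, prompt, and settings below are placeholders, not my actual test suite:

```python
# Minimal sketch: run the same prompt against two GGUF quants with
# llama-cpp-python (pip install llama-cpp-python) and eyeball the output.
# The prompt and file paths are placeholders, not the real test suite.
from llama_cpp import Llama

MODELS = [
    "ggml-deepseek-coder-33b-instruct-q4_k_m.gguf",
    "deepseek-coder-33b-instruct.Q4_K_M.gguf",
]

PROMPT = "Write a Python function that reverses a string."

for path in MODELS:
    # n_gpu_layers=-1 offloads all layers to the GPU; temperature=0.0
    # makes the comparison deterministic.
    llm = Llama(model_path=path, n_ctx=4096, n_gpu_layers=-1, verbose=False)
    out = llm(PROMPT, max_tokens=256, temperature=0.0)
    print(f"=== {path} ===")
    print(out["choices"][0]["text"])
```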

Would you consider creating quants for the Q5_K_S size as well? (This should be the largest quant that still fits into 24 GB of VRAM with 4K context.)
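For anyone wondering where the 24 GB figure comes from, here is a rough back-of-envelope estimate. The ~5.5 bits-per-weight figure for Q5_K_S and the KV-cache allowance are approximations, not exact values:

```python
# Rough VRAM estimate for a Q5_K_S quant of a 33B model.
# The bits-per-weight value is approximate; the KV-cache allowance
# for a 4K context is a rough guess and depends on the model's
# layer count and attention configuration.
params = 33e9       # parameter count
bpw = 5.5           # approx. bits per weight for Q5_K_S
weights_gb = params * bpw / 8 / 1e9
kv_cache_gb = 2.0   # rough allowance for a 4K-token KV cache

total = weights_gb + kv_cache_gb
print(f"weights: ~{weights_gb:.1f} GB, total: ~{total:.1f} GB")
# -> weights: ~22.7 GB, total: ~24.7 GB, i.e. right at the edge of 24 GB
```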

@dzupin there you go!
