I get worse answers than the original model.

#3
by JeisonJA - opened

I was testing the original model for a specific case and got very good results, but with the GGUF model supplied in this Space the answers are not as accurate. Maybe I'm not selecting the right one; which model do you recommend? I tried sqlcoder.Q3_K_M.gguf.

I never tried a Q3 model because of the reported quality loss; the model card itself states:

sqlcoder.Q3_K_M.gguf | very small, high quality loss

Try any of these instead:

sqlcoder.Q5_K_M.gguf | large, very low quality loss - recommended
sqlcoder.Q6_K.gguf | very large, extremely low quality loss
sqlcoder.Q8_0.gguf | very large, extremely low quality loss - not recommended
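If you script model loading, you can encode that preference order so the best available quant is picked automatically. A minimal sketch (the helper name and file list are just illustrative, not part of any library):

```python
# Preference order for sqlcoder GGUF quants, best tradeoff first
# (per the recommendations above: Q5_K_M recommended, Q8_0 overkill).
PREFERENCE = [
    "sqlcoder.Q5_K_M.gguf",  # large, very low quality loss - recommended
    "sqlcoder.Q6_K.gguf",    # very large, extremely low quality loss
    "sqlcoder.Q8_0.gguf",    # very large, extremely low quality loss
    "sqlcoder.Q3_K_M.gguf",  # last resort: very small, high quality loss
]

def pick_quant(available):
    """Return the most preferred quant file present in `available`, else None."""
    available = set(available)
    for name in PREFERENCE:
        if name in available:
            return name
    return None
```

You would then pass the chosen filename to whatever GGUF runtime you use (e.g. llama.cpp or llama-cpp-python).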

I tried a QLoRA custom model derived from Llama-2 7B, and it was consistent in terms of accuracy and perplexity. But when I quantized that model to Q5_K_M.gguf, perplexity was maintained while accuracy suffered.
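That split isn't contradictory: perplexity is just the exponentiated average negative log-likelihood per token, so it measures token-level fit, not end-task correctness. A quantized model can keep almost the same per-token likelihoods (flat perplexity) while small logit shifts flip the few tokens that decide whether a generated SQL query is actually right. A quick sketch of the metric itself:

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(-mean log-likelihood over tokens)."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# Two hypothetical models with nearly identical per-token log-likelihoods
# can still diverge on the single token that breaks a SQL query.
base  = [-0.10, -0.20, -0.15, -0.12]
quant = [-0.11, -0.21, -0.16, -0.13]  # slightly worse everywhere
```

Here `perplexity(quant)` is barely above `perplexity(base)`, which mirrors the observation above: near-unchanged perplexity after quantization, even if exact-match accuracy drops.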
