Quantisation parameters + Q5_K_M version?

#1
by smcleod - opened

Howdy!

Two questions. First, any chance you'd be able to share the quantisation commands / parameters used to create these?

As I think you've seen, I've had a crack at doing it myself, but your models consistently produce more reliable output, so I'm keen to learn how our methods differ.

Second, any chance of a Q5_K_M (with iMatrix) version?

Hey there, I remember seeing one of your comments (I don't recall where, though), and the commands you had were exactly the same as mine, so I think the issue may be somewhere else. I use a CUDA build (always the latest code from master) on a 3090, and I'm running on Windows, so maybe your setup is different.

And I just added Q5_K_M.

Interesting!

Thanks for the Q5 :)

smcleod changed discussion status to closed
