Q1 quantization?

#2
by Snowy4901 - opened

I have an old GTX 1650 with only 4GB of VRAM. The Q2_K at 1.8GB is a smidge too large, and ComfyUI decides to load it into RAM instead.

Would you be so kind as to upload a Q1_K version, if that's even possible? I know that the quality will be dramatically worse, but it's better than nothing, I guess.

My guess is that you have the process of creating GGUFs more or less automated. I hope it's not a hassle for you :)

Thank you in advance! And thanks for all your work with the LLMs!

I'm still hopeful someone from Unsloth will read this
