Q1 quantization?

by Snowy4901 - opened Jan 31

I have an old GTX 1650 with only 4GB of VRAM. The Q2_K at 1.8GB is a smidge too large, and ComfyUI decides to load it into RAM instead.

Would you be so kind as to upload a Q1_K version, if that's even possible? I know that the quality will be dramatically worse, but it's better than nothing, I guess.

My guess is that you have the process of creating GGUFs kind of automated. I hope it's not a hassle for you :)

Thank you in advance! And thanks for all your work with the LLMs!

Snowy4901

9 days ago

I'm still hopeful someone from Unsloth will read this.
