Can you quantize one for 24GB cards too?

#1
by CamiloMM - opened

Was gonna finally learn how to do it but hey if you're already doing it πŸ’¦

Much thanks πŸ™ fantastic model, honestly quite impressive in the GGUF.

I'm going to guess that's ~5.5 to 6 bpw? If so, sure!

Yes please!

(...I actually have no idea. I imagine ~5.5 to 6 bpw is fine, since 24GB runs a 70b at 2.3 bpw fine, which would mean even 8 bpw works for a 20b? (Not sure if it's as simple as parameters × bpw × x = VRAM.)
Though that's probably pushing it, and even 5 bpw is already diminishing returns, I imagine. I'd make it 6 bpw.)
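For the back-of-the-envelope math above: to a first approximation the weights alone take `parameters × bpw / 8` bytes, though actual usage is higher once you add KV cache, activations, and runtime overhead. A rough sketch (weights only, all names here are made up for illustration):

```python
def weight_vram_gib(params_billion: float, bpw: float) -> float:
    """Rough weight-only VRAM estimate in GiB.

    Ignores KV cache, context length, and runtime overhead,
    so real usage will be noticeably higher.
    """
    total_bits = params_billion * 1e9 * bpw
    return total_bits / 8 / 1024**3

# 20b at 6 bpw: weights alone are ~14 GiB, leaving headroom on a 24GB card
print(round(weight_vram_gib(20, 6), 1))

# 70b at 2.3 bpw: ~18.7 GiB of weights, which is why it squeezes into 24GB
print(round(weight_vram_gib(70, 2.3), 1))
```

So 6 bpw for a 20b should fit comfortably on 24GB even with a decent context window, while 8 bpw (~18.6 GiB of weights) would be tighter.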
