3bit quantization
#3 by nbzj - opened
Is it possible to apply 3-bit quantization? I think it might fit on a consumer GPU with 24 GB of VRAM.
Yes, I hope it will, and I plan to do that soon.
The only reason I've not done it yet is that Falcon40B is so slow at the moment when quantised, so it's not hugely useful.
But I'll do a 3-bit version anyway, and hopefully the slowdown issue will be improved soon.
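
For anyone wondering whether 24 GB is plausible, here is a rough back-of-the-envelope sketch of the weight storage at different bit widths. It assumes a round 40 billion parameters (taken from the model name, not an exact count) and counts only the packed weights. Quantisation scales and zero-points, activations, and the KV cache are not included, so real usage will be higher. The function name `weight_memory_gb` is made up for this illustration.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate storage for the weights alone, in GB (1 GB = 10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: a round 40e9 parameters, based on the "40B" in the model name.
N_PARAMS = 40e9

for bits in (16, 4, 3):
    # Weights only; runtime overhead is not counted here.
    print(f"{bits}-bit: ~{weight_memory_gb(N_PARAMS, bits):.0f} GB")
```

By this estimate the 3-bit weights come to about 15 GB versus about 20 GB at 4-bit, so 3-bit leaves noticeably more headroom on a 24 GB card for everything the estimate leaves out.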