3-bit quantization

#3
by nbzj - opened

Is it possible to apply 3-bit quantization? I think the model might then fit on a consumer GPU with 24 GB of VRAM.
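A rough back-of-the-envelope check of that estimate, as a minimal sketch: it assumes a nominal 40 billion parameters (taken from the model name, not an exact count) and counts only the packed weights. The function name `weight_memory_gib` is made up for illustration. Real quantized checkpoints also store per-group scales and zero points, and inference needs extra memory for activations and the KV cache, so actual usage will be higher.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB (hypothetical helper)."""
    bytes_total = n_params * bits_per_weight / 8
    return bytes_total / 2**30


# Assumption: nominal parameter count implied by the "40B" in the model name.
N_PARAMS = 40e9

for bits in (16, 4, 3):
    print(f"{bits:>2}-bit weights: {weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```

Under these assumptions the weights come to roughly 14 GiB at 3 bits versus about 18.6 GiB at 4 bits, so 3-bit leaves noticeably more headroom on a 24 GB card for everything that is not weights.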

Yes, I hope it will, and I plan to do that soon.

The only reason I've not done it yet is that Falcon40B is so slow at the moment when quantised that it's not hugely useful.

But I'll do a 3-bit version anyway, and hopefully the slowdown issue will be improved soon.
