Details about this model
#4
by
at676
- opened
Hi, is this the same "1-bit 7B" model that beats QuIP# 2-bit in your HQQ+ blog post? If so, isn't this effectively a 3-bit model, since you store a 16-bit scale for every group of 8 weights, which adds an extra 2 bits per weight? The size on disk reported by HF agrees: the model is around the size of a 3-bit 7B model.
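The bit accounting above can be sketched in a few lines (the function name and the assumption of a single fp16 scale per group, with no other per-group metadata, are illustrative):

```python
def effective_bits_per_weight(weight_bits, group_size, scale_bits=16):
    # Per-group metadata (here just an fp16 scale) is amortized
    # across every weight in the group.
    return weight_bits + scale_bits / group_size

# 1-bit weights with a 16-bit scale per group of 8:
# 1 + 16/8 = 3 bits per weight
bits = effective_bits_per_weight(1, 8)
print(bits)  # 3.0

# Rough on-disk size for a 7B-parameter model at that rate
params = 7e9
print(params * bits / 8 / 1e9, "GB")  # 2.625 GB
```

This matches the observation that the checkpoint is roughly the size of a 3-bit 7B model.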
Hi, thanks for your message!
You need to load the model first and check how much VRAM it actually uses. As the description says, the file also contains some CPU metadata, which inflates the on-disk size.
I wouldn't recommend using this model anyway; the 2-bit HQQ+ model is much better in quality and uses a comparable amount of VRAM to QuIP#.