Details about this model

#4
by at676 - opened

Hi, is this the same "1-bit 7B" model that beats QuIP# 2-bit in your HQQ+ blog post? If so, isn't this actually a 3-bit model? You use a 16-bit scale for every group of 8 weights, which adds an extra 2 bits per weight. The size on disk reported by HF agrees with me: your model is around the size of a 3-bit 7B model.
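The arithmetic behind this estimate can be sketched as follows. This is a minimal illustration, assuming the figures stated above (1-bit weights, a 16-bit scale per group of 8 weights) and ignoring any other per-tensor metadata; the helper name is hypothetical:

```python
def effective_bits_per_weight(weight_bits: float, group_size: int,
                              scale_bits: int, zero_bits: int = 0) -> float:
    """Storage cost per weight once per-group metadata is amortized.

    Each group of `group_size` weights carries one `scale_bits`-wide scale
    (and optionally a `zero_bits`-wide zero point), so that overhead is
    divided evenly across the group.
    """
    return weight_bits + (scale_bits + zero_bits) / group_size

# Figures from the question: 1-bit weights, 16-bit scale, group size 8.
print(effective_bits_per_weight(1, 8, 16))  # -> 3.0
```

So under these assumptions the on-disk footprint does work out to roughly 3 bits per weight, even though the quantized values themselves are 1-bit.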

Mobius Labs GmbH org

Hi, thanks for your message!
You need to load the model first and see how much VRAM it actually uses; the file contains some CPU metadata, as the description says.
I wouldn't recommend using this model anyway. The 2-bit HQQ+ is much better in quality and would use VRAM comparable to QuIP#.
