Details about this model
#4
by
at676
- opened
Hi, is this the same "1-bit 7B" model that beats QuIP# 2-bit in your HQQ+ blog post? If so, isn't this effectively a 3-bit model, since you store a 16-bit scale for every group of 8 weights, which adds an extra 2 bits per weight? The size on disk reported by HF agrees: the model is around the size of a 3-bit 7B model.
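The bit accounting above can be sketched in a few lines (the function name and the assumption of a single fp16 scale per group, with no other per-group metadata, are illustrative):

```python
def effective_bits_per_weight(weight_bits, group_size, scale_bits=16):
    # Per-group metadata (here just an fp16 scale) is amortized
    # across every weight in the group.
    return weight_bits + scale_bits / group_size

# 1-bit weights with a 16-bit scale per group of 8:
# 1 + 16/8 = 3 bits per weight
bits = effective_bits_per_weight(1, 8)
print(bits)  # 3.0

# Rough on-disk size for a 7B-parameter model at that rate
params = 7e9
print(params * bits / 8 / 1e9, "GB")  # 2.625 GB
```

This matches the observation that the checkpoint is roughly the size of a 3-bit 7B model.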
Hi, thanks for your message!
You need to load the model first and check how much VRAM it actually uses. As the description says, the file also contains some CPU metadata, which inflates the on-disk size.
I wouldn't recommend using this model anyway; the 2-bit HQQ+ model is much better in quality and uses a comparable amount of VRAM to QuIP#.