Why is this exactly the same size as the 8-bit one?

#1
by dagelf - opened

I'm guessing there is a mistake...?

LLM-Quantization org

In our research, to quickly validate the effectiveness of various quantization methods, we only performed fake quantization (fake-quant) on SmoothQuant without storing the weights in a real 4-bit format. The checkpoints we saved and uploaded are therefore the same size as the fp16 model.πŸͺ„
We will continue to improve our work to make the quantization testing as realistic as possible, with software and hardware support. More work is on the way!πŸ€—
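For anyone else wondering what fake-quant means here: the weights are rounded to the low-bit grid and then immediately dequantized, so accuracy reflects 4-bit precision while storage stays in floating point. A minimal sketch (not the authors' actual code; symmetric per-tensor quantization is assumed for illustration):

```python
import numpy as np

def fake_quantize(w, num_bits=4):
    # Symmetric per-tensor fake quantization: snap values to the
    # signed int grid, then dequantize straight back to float.
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return (q * scale).astype(w.dtype)

w = np.random.randn(8, 8).astype(np.float16)
w_fq = fake_quantize(w, num_bits=4)
# The tensor only takes at most 2**4 = 16 distinct values,
# but it is still stored as fp16 -- hence the unchanged file size.
print(w_fq.dtype)
```

So a "4-bit" checkpoint saved this way occupies exactly as many bytes on disk as the fp16 original.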

That makes sense, I guess I could've looked :-D Thank you for the clarification!

dagelf changed discussion status to closed
