Measurements.json next time please!

#1
by ProphetOfBostrom - opened

The measurement.json file would make it faster for me and anyone else to make another quant of this model with the same settings as yours but a different BPW.

Like bartowski does here.
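
A rough sketch of what that reuse could look like, assuming exllamav2's `convert.py` and its `-m` (measurement) option as described in its conversion docs. The directory names, the file name, and the `build_convert_command` helper are placeholders made up for illustration, and the flags are worth checking against the exllamav2 version you have installed:

```python
import shlex

def build_convert_command(model_dir, work_dir, out_dir, measurement, bpw):
    """Build (but do not run) a convert.py command that reuses a measurement file."""
    return [
        "python", "convert.py",
        "-i", model_dir,     # unquantized model directory (placeholder path)
        "-o", work_dir,      # scratch/working directory for the job
        "-cf", out_dir,      # directory for the finished quant
        "-m", measurement,   # existing measurement file, so the measurement pass is skipped
        "-b", str(bpw),      # target bits per weight
    ]

cmd = build_convert_command(
    "model_fp16", "work", "model-3.8bpw-exl2", "measurement.json", 3.8
)
print(shlex.join(cmd))  # paste the printed line into a shell inside the exllamav2 checkout
```

The only difference from a from-scratch run is the `-m` argument; everything else is the same command with a different `-b` value.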

It's your repo, of course, and the file isn't difficult to regenerate, but just a thought.
Ramble:
I don't think I'll actually need it this time, since I seem to have good luck running exl2's quantization script, but it seems a shame that most of the computational work, the measurement pass, is identical for every size of quant and still has to be redone each time. I've never actually heard of a 32 GB GPU, so as a 3090 peasant I can't use your quant, which is fine. God knows uploading these large files can be a nightmare. But I'll have to repeat exactly the same calculations as you to generate a 3.8 BPW or whatever, with only a bit of combinatorial stuff at the end (picking per-layer settings to hit the target) and the actual integer rounding being different. Measurement files are just a few megabytes.
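
To make the "combinatorial stuff at the end" concrete, here is a toy sketch of why one measurement can serve any target BPW. It is not exllamav2's actual optimizer: the `MEASUREMENT` table, its numbers, and the greedy `pick_options` function are all invented for illustration, and it assumes every layer holds the same number of weights.

```python
# Toy "measurement": for each layer, candidate (bits_per_weight, measured_error)
# pairs, sorted by increasing bits. All numbers are made up.
MEASUREMENT = {
    "layer0": [(2.5, 0.040), (3.5, 0.020), (4.5, 0.010), (6.0, 0.004)],
    "layer1": [(2.5, 0.080), (3.5, 0.030), (4.5, 0.012), (6.0, 0.005)],
    "layer2": [(2.5, 0.020), (3.5, 0.012), (4.5, 0.008), (6.0, 0.003)],
}

def pick_options(measurement, target_bpw):
    """Greedily pick one option per layer, keeping the mean bpw <= target_bpw.

    Starts every layer at its cheapest option, then repeatedly applies the
    upgrade with the best error reduction per extra bit that still fits.
    If target_bpw is below the cheapest options, those are returned anyway.
    """
    choice = {name: 0 for name in measurement}
    budget = target_bpw * len(measurement)
    used = sum(opts[0][0] for opts in measurement.values())
    while True:
        best = None
        for name, opts in measurement.items():
            i = choice[name]
            if i + 1 >= len(opts):
                continue  # already at the largest option
            extra_bits = opts[i + 1][0] - opts[i][0]
            if used + extra_bits > budget + 1e-9:
                continue  # upgrade would exceed the bit budget
            gain = (opts[i][1] - opts[i + 1][1]) / extra_bits
            if best is None or gain > best[0]:
                best = (gain, name, extra_bits)
        if best is None:
            break
        _, name, extra_bits = best
        choice[name] += 1
        used += extra_bits
    return {name: measurement[name][i] for name, i in choice.items()}

def mean_bpw(picked):
    return sum(bits for bits, _ in picked.values()) / len(picked)

def total_error(picked):
    return sum(err for _, err in picked.values())

# The same table is reused for every target; only the selection step reruns.
for target in (3.0, 3.8, 5.0):
    picked = pick_options(MEASUREMENT, target)
    print(target, round(mean_bpw(picked), 2), round(total_error(picked), 3))
```

The expensive part, filling in the error column, happens once. Changing the target only reruns the cheap selection loop, which is the reason sharing the measurement file saves so much work.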

This matters most because, at some VRAM sizes, you can feasibly run the quantization step but not the measurement (calibration) pass for an exl2 that you could then run yourself. In those cases, sharing the measurement file essentially lets people with less VRAM quantize as though they had as much as you. I think that, in principle, generating a quant from an existing measurement requires almost no video memory at all.
