Quantization command?

#4
by akoyaki - opened

Hi, I've been waiting a long time to try the new exl2-2 quantization (the Reddit post says 2.4bpw is better than 4.5bpw/Q4_K_S), and recently decided to try it myself, but testing the same 3bpw + PIPPA calibration setup gives confusing results: 3.8 PPL for your 4.5bpw, 8.17 for your 3bpw, and 13.4 for my own 3bpw quant.
I used "python convert.py -i /path -o /out_path -c pippa_raw_fix.parquet -b 3 -hb 6"
Can you share your quantization command please? Do I need any other parameters to improve the quality?
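For reference, the numbers being compared in this thread are perplexity (PPL) scores: the exponential of the mean per-token cross-entropy on an evaluation text, lower is better. Here is a minimal sketch of that computation using plain transformers/torch rather than exllamav2's own eval tooling; the model path and evaluation file are placeholders, and a real evaluation would chunk the text to the model's context length:

```python
# Minimal perplexity sketch. model_dir and eval_sample.txt are placeholders,
# not files from this thread.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "/path/to/model"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

text = open("eval_sample.txt").read()
ids = tokenizer(text, return_tensors="pt").input_ids.to(model.device)

with torch.no_grad():
    # Passing labels=ids makes the model return the mean cross-entropy loss
    # over the sequence (shifted internally for next-token prediction).
    loss = model(ids, labels=ids).loss

# PPL = exp(mean negative log-likelihood per token)
print(f"perplexity: {torch.exp(loss).item():.2f}")
```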

I use a pretty similar command, but I just don't specify -hb (are you going for head bits with that one?)

py .\convert.py -i 'Goliath-120B-path' -o 'Goliath-120B-3bpw-path' -c pippa_raw_fix.parquet -b 3
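Side note, since both commands point -c at the same file: a quick sanity check of the calibration parquet before a long quant run can save some pain. A minimal sketch with pandas; the layout of pippa_raw_fix.parquet is assumed, not confirmed in this thread:

```python
# Inspect the calibration dataset before quantizing.
import pandas as pd

df = pd.read_parquet("pippa_raw_fix.parquet")
print(df.shape)     # how many calibration rows are actually there
print(df.columns)   # confirm which column holds the text
print(df.iloc[0])   # eyeball the first row for obvious corruption
```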

Well, I tried it twice, with and without -hb (the default is 6 if not set):
-hb 3 gives 13.8
-hb 6 gives 13.4
Maybe the new quant method just isn't good for big models in the 100~120B range?
I'll try again with an older version of exllamav2, using the old method, to figure it out.
Thanks for sharing.


Well, it looks like the 3bpw rpcal quant just randomly crashes for no reason:
2.9bpw exl2-2 rpcal: 5.13
2.4bpw exl2-2: 5.96
2.4bpw exl2-2 rpcal: 6.68
3bpw exl2-2 rpcal: 13.8
I'm getting 11~15 t/s with 2.9bpw+sd on 2x3090, feels good
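For anyone curious how a quant like this gets split across two 3090s, here is a minimal loading sketch using exllamav2's Python API as it existed around this release; the model path and per-GPU gigabyte split are placeholders to tune, and the speculative-decoding ("sd") part is omitted:

```python
# Load an exl2 quant across two GPUs. Path and gpu_split values are
# placeholders; adjust the GB split until both cards fit their share.
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/path/to/Goliath-120B-2.9bpw-exl2"  # placeholder path
config.prepare()

model = ExLlamaV2(config)
model.load(gpu_split=[21, 23])  # GB reserved per GPU, tune for 2x3090

tokenizer = ExLlamaV2Tokenizer(config)
cache = ExLlamaV2Cache(model)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8

print(generator.generate_simple("Hello,", settings, 64))
```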

Pretty interesting results, but as you keep testing you'll notice PPL isn't everything; what matters more is whether you like the model itself.

Glad quanting worked for you.

Panchovix changed discussion status to closed

Haha, I know. I'm not trying to get the best score, I just want to know why the scores are different.
And yes, I completely agree with you: PPL isn't everything, and neither are benchmarks. I'd rather use a new model for a few hours to get a feel for it than just test its score.

I've uploaded new quants in any case, if you want to try them. If you do, I suggest backing up your existing quants first.
