Introducing AutoRound INT4 algorithm

#4
by wenhuach - opened

Hello,
First and foremost, I would like to express my gratitude for your exceptional work and for sharing your model with the community. We have recently applied AutoRound to your model and achieved better accuracy for the INT4 model. The accuracies below were all measured with real quantized models, in the same environment, on zero-shot tasks.

| Metric | BF16   | 01-ai/Yi-6B-Chat-4bits | AutoRound INT4 |
|--------|--------|------------------------|----------------|
| Avg.   | 0.6043 | 0.5867                 | 0.5939         |
| mmlu   | 0.6163 | 0.6133                 | 0.6119         |
| cmmlu  | 0.7431 | 0.7312                 | 0.7314         |
| ceval  | 0.7355 | 0.7155                 | 0.7281         |
| gsm8k  | 0.3222 | 0.2866                 | 0.3040         |
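For anyone who wants to reproduce numbers like these, here is a minimal sketch using EleutherAI's lm-evaluation-harness Python API. The harness itself, the local model path, and the exact task names are assumptions on my part; the thread does not specify them.

```python
# Hedged sketch: zero-shot evaluation with lm-evaluation-harness (v0.4+).
# The harness choice, model path, and task names are assumptions,
# not details stated in this thread.
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="hf",
    model_args="pretrained=./Yi-6B-Chat-int4",  # hypothetical local path
    tasks=["mmlu", "cmmlu", "ceval-valid", "gsm8k"],
)

# Print the per-task metric dictionaries.
for task, metrics in results["results"].items():
    print(task, metrics)
```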

Unfortunately, we are unable to upload the quantized model ourselves due to licensing constraints. We would therefore appreciate it if you could generate it by following the recipe links, and we are here to provide assistance. Additionally, we would greatly appreciate it if you would consider using our method to generate quantized models for your future releases.
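To make "generate it yourself" concrete, here is a minimal sketch of quantization with the AutoRound Python API under typical settings (4 bits, group size 128). This is not the exact recipe from this thread; please treat the recipe links as authoritative, and note that argument names may differ across versions.

```python
# Minimal sketch of INT4 quantization with AutoRound's Python API.
# bits/group_size are typical defaults, not the exact recipe; the output
# directory is hypothetical.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "01-ai/Yi-6B-Chat"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Tune the weight rounding, then export the quantized checkpoint.
autoround = AutoRound(model, tokenizer, bits=4, group_size=128)
autoround.quantize()
autoround.save_quantized("./Yi-6B-Chat-int4")  # hypothetical output dir
```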

Cool! May I ask what the license restriction specifically refers to? Quantized versions of our model may be self-uploaded to HF. In addition, you can submit a PR to the Ecosystem section of our repo, and we can recommend your model.

Hi,
As our company has a very strict legal review process that typically takes a long time, we could not upload it, at least for now. So we would appreciate it if you could try generating it yourself.
Besides, we now support local data and combinations of different datasets for calibration, e.g. `--dataset "./tmp.json,NeelNanda/pile-10k"`. Using your own training data for calibration may lead to better accuracy. A gentle reminder: for now, we drop samples shorter than `args.seqlen`.
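In Python, the same combined-dataset calibration would look roughly like the sketch below; the `dataset` and `seqlen` arguments mirror the CLI flag and note above, and the `seqlen` value is only illustrative.

```python
# Hedged sketch: AutoRound calibration on a local JSON file plus a HF
# dataset, mirroring --dataset "./tmp.json,NeelNanda/pile-10k" above.
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "01-ai/Yi-6B-Chat"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

autoround = AutoRound(
    model,
    tokenizer,
    bits=4,
    group_size=128,
    dataset="./tmp.json,NeelNanda/pile-10k",  # local file + HF dataset
    seqlen=2048,  # illustrative; shorter samples are dropped
)
autoround.quantize()
autoround.save_quantized("./Yi-6B-Chat-int4-custom-calib")  # hypothetical dir
```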

Got it. I will use your method to generate a quantized model and upload it!

Thank you for your kind understanding.

wenhuach changed discussion status to closed
