Add `quantization_config` in config.json?

#4
by WeiwenXia - opened

Hi.
Thanks a lot for uploading this model.
We usually find a `quantization_config` in config.json when a model is quantized, but this repo's config.json does not provide one. Could you please add a `quantization_config` to config.json with items such as `quant_method`, `bits`, and `group_size`? Thanks.

We only support inference of this model in SGLang, which does not need the `quantization_config`. You can add it yourself: set `quant_method` to the corresponding W8A8 INT8 method in your inference framework, `bits` to 8, and no `group_size`.

I understand that you don't need it for SGLang. However, could you please do the Hugging Face community a favor and add the following to config.json, if it won't break your SGLang workflow? Then people can read the `quantization_config` for other frameworks without modifying the file themselves, which would be more convenient for the community. We would appreciate it if you could do that. Thanks.

  "quantization_config": {
    "quant_method": "int8",
    "bits": 8,
    "group_size": -1
  }
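For anyone who wants this before the repo is updated, here is a minimal sketch of how one might patch a locally downloaded config.json with the block above. The local path is a placeholder, and the exact `quant_method` string may need to match whatever your target framework expects.

```python
# Minimal sketch: add the suggested quantization_config to a local
# copy of config.json. The path below is a placeholder; adjust the
# quant_method value to your framework's naming if needed.
import json
from pathlib import Path

config_path = Path("path/to/local/model/config.json")  # placeholder path

config = json.loads(config_path.read_text())
config.setdefault("quantization_config", {
    "quant_method": "int8",  # W8A8 INT8 per the maintainer's note
    "bits": 8,
    "group_size": -1,        # no group size
})
config_path.write_text(json.dumps(config, indent=2))
```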

I will try to add it, but I need some time to do the verification.

Thanks in advance!
