Add `quantization_config` in config.json?
Hi, thanks a lot for uploading this model. We usually find a `quantization_config` in config.json when a model is quantized, but it is not provided in this repo's config.json. Could you please add a `quantization_config` to config.json with items such as `quant_method`, `bits`, `group_size`, etc.? Thanks.
We only support inference of this model in SGLang, which doesn't need the `quantization_config`. You can add it yourself: set `quant_method` to the corresponding w8a8 int8 method in your inference framework, `bits` to 8, and no `group_size`.
I understand that you don't need it for SGLang. However, could you please do the Hugging Face community a favor and add the following to config.json, provided it won't break your SGLang workflow? Then people can read the `quantization_config` for other frameworks without modifying the file themselves, which would be more convenient for the community. We would appreciate it if you could do that. Thanks.
"quantization_config": {
"quant_method": "int8",
"bits": 8,
"group_size": -1
}
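For example (a minimal sketch; the repo id below is a placeholder), downstream users could then inspect the setting directly, since transformers exposes extra config.json keys as attributes on the loaded config:

```python
from transformers import AutoConfig

# Placeholder repo id; replace with this model's actual Hub id.
config = AutoConfig.from_pretrained("org/model-w8a8-int8")

# Prints the quantization_config dict if present, otherwise None.
print(getattr(config, "quantization_config", None))
```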
I will try to add it, but I need some time to do the verification.
Thanks in advance!