Add `quantization_config` in config.json?
Hi, thanks a lot for uploading this model. We usually find a `quantization_config` in config.json when a model is quantized, but it is not provided in this repo's config.json. Could you please add a `quantization_config` to config.json with items such as `quant_method`, `bits`, `group_size`, etc.? Thanks.
We only support inference of this model in SGLang, which doesn't need the `quantization_config`. You can add it yourself: set `quant_method` to the corresponding w8a8 int8 method in your inference framework, `bits` to 8, and no `group_size`.
I understand that you don't need it for SGLang. However, could you please do the Hugging Face community a favor and add the following to config.json, provided it won't break your SGLang workflow? Then people can read the `quantization_config` for other frameworks without modifying the file themselves, which would be more convenient for the community. We would appreciate it if you could do that. Thanks.
"quantization_config": {
"quant_method": "int8",
"bits": 8,
"group_size": -1
}
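For example (a minimal sketch; the repo id below is a placeholder), downstream users could then inspect the setting directly, since transformers exposes extra config.json keys as attributes on the loaded config:

```python
from transformers import AutoConfig

# Placeholder repo id; replace with this model's actual Hub id.
config = AutoConfig.from_pretrained("org/model-w8a8-int8")

# Prints the quantization_config dict if present, otherwise None.
print(getattr(config, "quantization_config", None))
```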
I will try to add it, but I need some time to do the verification.
Thanks in advance!