Alikestocode committed on
Commit
022b2da
·
1 Parent(s): 33a1d2e

Fix AWQModifier: use quantization_config with num_bits

- AWQModifier requires quantization_config parameter
- Create QuantizationConfig with num_bits=4, group_size=128
- Fixes assertion error about num_bits configuration
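
For readers unfamiliar with the three settings named in this commit (num_bits=4, group_size=128, zero_point=True), the sketch below shows what they generally mean numerically: each run of 128 weights gets its own scale and integer zero point, and each weight is stored as a code in the range 0..15. This is a plain-Python illustration under that assumption, not the AWQ algorithm (which also rescales weights using activation statistics), and it does not call llmcompressor or compressed_tensors. The function names are hypothetical.

```python
import math

# Illustrative only: group-wise asymmetric (zero-point) quantization in plain Python.
# Hypothetical helper names; not the llmcompressor or compressed_tensors API.

def quantize_group(values, num_bits=4):
    """Quantize one group of floats to unsigned codes with a scale and zero point."""
    qmax = (1 << num_bits) - 1                 # 15 when num_bits=4
    lo = min(min(values), 0.0)                 # widen the range to include 0.0 so the
    hi = max(max(values), 0.0)                 # zero point always fits in [0, qmax]
    scale = (hi - lo) / qmax or 1.0            # fall back to 1.0 for an all-zero group
    zero_point = round(-lo / scale)            # integer code that represents 0.0
    codes = [min(qmax, max(0, round(v / scale) + zero_point)) for v in values]
    return codes, scale, zero_point

def dequantize_group(codes, scale, zero_point):
    """Map codes back to approximate float values."""
    return [(c - zero_point) * scale for c in codes]

def quantize_row(row, num_bits=4, group_size=128):
    """Split a weight row into consecutive groups and quantize each independently."""
    return [
        quantize_group(row[start:start + group_size], num_bits)
        for start in range(0, len(row), group_size)
    ]

def dequantize_row(groups):
    """Concatenate the dequantized groups back into one row."""
    out = []
    for codes, scale, zero_point in groups:
        out.extend(dequantize_group(codes, scale, zero_point))
    return out

# Usage: a 256-element row yields two groups of 128, each with its own scale.
row = [math.sin(i) * (1 + i / 64) for i in range(256)]
groups = quantize_row(row, num_bits=4, group_size=128)
restored = dequantize_row(groups)
print(len(groups), max(abs(a - b) for a, b in zip(row, restored)))
```

A smaller group_size gives each scale fewer weights to cover, which usually lowers the rounding error at the cost of storing more scales and zero points.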

Files changed (1)
  1. quantize_to_awq_colab.ipynb +13 -4
quantize_to_awq_colab.ipynb CHANGED
@@ -254,10 +254,19 @@
  " print(f\" → Starting quantization with LLM Compressor...\")\n",
  " print(f\" → This may take 30-60 minutes depending on model size...\")\n",
  " \n",
- " # AWQModifier API: Use AWQModifier() without parameters\n",
- " # The modifier uses default 4-bit AWQ settings\n",
- " print(f\" → Creating AWQModifier with default settings...\")\n",
- " modifiers = [AWQModifier()]\n",
+ " # AWQModifier requires quantization_config with num_bits\n",
+ " # Create quantization config for 4-bit AWQ\n",
+ " from compressed_tensors.quantization import QuantizationConfig\n",
+ " \n",
+ " print(f\" → Creating quantization config for 4-bit AWQ...\")\n",
+ " quant_config = QuantizationConfig(\n",
+ " num_bits=4, # 4-bit quantization\n",
+ " group_size=128, # Group size\n",
+ " zero_point=True # Zero-point quantization\n",
+ " )\n",
+ " \n",
+ " print(f\" → Creating AWQModifier with quantization config...\")\n",
+ " modifiers = [AWQModifier(quantization_config=quant_config)]\n",
  " print(f\" → AWQModifier created successfully\")\n",
  " \n",
  " # Call oneshot with the modifier\n",