Commit · 022b2da
1 Parent(s): 33a1d2e
Fix AWQModifier: use quantization_config with num_bits
- AWQModifier requires quantization_config parameter
- Create QuantizationConfig with num_bits=4, group_size=128
- Fixes assertion error about num_bits configuration
- quantize_to_awq_colab.ipynb +13 -4
quantize_to_awq_colab.ipynb CHANGED
@@ -254,10 +254,19 @@
  " print(f\" β Starting quantization with LLM Compressor...\")\n",
  " print(f\" β This may take 30-60 minutes depending on model size...\")\n",
  " \n",
- " # AWQModifier
- " #
- "
- "
+ " # AWQModifier requires quantization_config with num_bits\n",
+ " # Create quantization config for 4-bit AWQ\n",
+ " from compressed_tensors.quantization import QuantizationConfig\n",
+ " \n",
+ " print(f\" β Creating quantization config for 4-bit AWQ...\")\n",
+ " quant_config = QuantizationConfig(\n",
+ " num_bits=4, # 4-bit quantization\n",
+ " group_size=128, # Group size\n",
+ " zero_point=True # Zero-point quantization\n",
+ " )\n",
+ " \n",
+ " print(f\" β Creating AWQModifier with quantization config...\")\n",
+ " modifiers = [AWQModifier(quantization_config=quant_config)]\n",
+ " print(f\" β AWQModifier created successfully\")\n",
  " print(f\" β AWQModifier created successfully\")\n",
  " \n",
  " # Call oneshot with the modifier\n",
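For readers unfamiliar with the three settings added in this hunk (num_bits=4, group_size=128, zero_point=True), the sketch below shows roughly what they mean numerically: weights are split into groups of 128, and each group gets its own scale and integer zero point that map floats onto the 16 levels of a 4-bit unsigned integer. This is a plain-Python illustration only. It is not the implementation used by LLM Compressor or compressed-tensors, it omits the activation-aware scaling that AWQ adds on top, and the function names and sample weights are hypothetical.

```python
import math


def quantize_group(values, num_bits=4):
    """Asymmetric (zero-point) quantization of one group of floats.

    Returns (quantized_ints, scale, zero_point). Hypothetical helper, for
    illustration only.
    """
    qmax = 2 ** num_bits - 1                 # 15 levels above zero for 4 bits
    lo = min(min(values), 0.0)               # keep 0.0 inside the covered range
    hi = max(max(values), 0.0)
    scale = (hi - lo) / qmax or 1.0          # fall back to 1.0 for an all-zero group
    zero_point = round(-lo / scale)          # integer that represents 0.0
    q = [min(qmax, max(0, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize_group(q, scale, zero_point):
    """Map quantized integers back to approximate float values."""
    return [(x - zero_point) * scale for x in q]


def quantize_weights(weights, num_bits=4, group_size=128):
    """Quantize a flat list of weights, one (scale, zero_point) pair per group."""
    return [
        quantize_group(weights[start:start + group_size], num_bits)
        for start in range(0, len(weights), group_size)
    ]


# Hypothetical sample data: 256 weights, so two groups of 128.
weights = [0.1 * math.sin(i) for i in range(256)]
groups = quantize_weights(weights)
restored = [x for q, s, z in groups for x in dequantize_group(q, s, z)]
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(len(groups), max_error)
```

A smaller group_size gives each scale and zero point fewer weights to cover, which usually lowers the rounding error at the cost of storing more per-group metadata.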