Commit 0b94033 by MengniWang
1 parent: dcb0485
Update README.md
README.md
CHANGED
@@ -49,7 +49,7 @@ Install `onnxruntime>=1.16.0` to support [`MatMulFpQ4`](https://github.com/micro
 
 ### Run Quantization
 
-
+Build [Intel® Neural Compressor](https://github.com/intel/neural-compressor/tree/master) from master branch and run INT4 weight-only quantization.
 
 The weight-only quantization cofiguration is as below:
 | dtype | group_size | scheme | algorithm |