MengniWang committed dc4aa95 (1 parent: 98973a5): Update README.md
README.md CHANGED
@@ -49,7 +49,7 @@ Install `onnxruntime>=1.16.0` to support [`MatMulFpQ4`](https://github.com/micro
 
 ### Run Quantization
 
-
+Build [Intel® Neural Compressor](https://github.com/intel/neural-compressor/tree/master) from master branch and run INT4 weight-only quantization.
 
 The weight-only quantization configuration is as follows:
 | dtype | group_size | scheme | algorithm |
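
For context, the added README line describes building Intel® Neural Compressor from the master branch and then running INT4 weight-only quantization on the ONNX model. Below is a minimal sketch of what such a run might look like with Neural Compressor's `PostTrainingQuantConfig` weight-only path; the model path `model.onnx`, the output path `model_int4.onnx`, and the `group_size`/`scheme`/`algorithm` values are illustrative assumptions, not settings taken from this repository.

```python
# Minimal sketch (assumption): INT4 weight-only quantization of an ONNX model
# with Intel Neural Compressor built from the master branch.
# File paths and the RTN / group_size / scheme choices below are placeholders.
from neural_compressor import PostTrainingQuantConfig, quantization

config = PostTrainingQuantConfig(
    approach="weight_only",
    op_type_dict={
        ".*": {  # apply the weight-only settings to matching ops
            "weight": {
                "bits": 4,           # INT4 weights
                "group_size": 32,    # quantization group granularity
                "scheme": "sym",     # symmetric quantization
                "algorithm": "RTN",  # round-to-nearest weight-only algorithm
            },
        },
    },
)

# "model.onnx" stands in for the FP32 ONNX model to be quantized.
q_model = quantization.fit("model.onnx", config)
q_model.save("model_int4.onnx")
```

The `bits`, `group_size`, `scheme`, and `algorithm` fields in this sketch correspond to the dtype, group_size, scheme, and algorithm columns of the weight-only quantization configuration table referenced in the README.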