[Question] encoder-openvino-int4
#1, opened by jungsanghyun
Is there a script or an explanation of how to make the quantized whisper-encoder-openvino-int4 model?
Hi. I used the convert-whisper-to-openvino.py script from this PR (not merged): https://github.com/ggerganov/whisper.cpp/pull/2184.
At least on my computer, the INT4 models are noticeably faster on CPU than the regular ones. The disadvantage is that they cannot be used on the GPU.
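For anyone who wants to experiment outside the PR script, here is a rough sketch of INT4 weight compression applied to an already converted OpenVINO encoder, using NNCF. This is an assumption about the general approach, not a copy of what convert-whisper-to-openvino.py in the PR does. The function names, the "-int4" output suffix, and the group size of 128 are illustrative choices, and the input path is whatever encoder .xml file you already have.

```python
from pathlib import Path


def int4_output_path(encoder_xml: str) -> Path:
    """Hypothetical naming helper: put '<stem>-int4.xml' next to the input file."""
    src = Path(encoder_xml)
    return src.with_name(f"{src.stem}-int4{src.suffix}")


def compress_encoder_to_int4(encoder_xml: str, group_size: int = 128) -> Path:
    """Compress the weights of an OpenVINO IR model to INT4 and save the result.

    Assumes the `openvino` and `nncf` packages are installed and that
    `encoder_xml` points to an existing OpenVINO IR (.xml with its .bin).
    """
    # Imported here so the naming helper above works without these packages.
    import nncf
    import openvino as ov

    model = ov.Core().read_model(encoder_xml)

    # Weight-only compression: weights are stored as symmetric 4-bit integers,
    # quantized in groups of `group_size` values. ratio=1.0 asks NNCF to use
    # INT4 for all eligible layers instead of keeping a share in 8-bit.
    compressed = nncf.compress_weights(
        model,
        mode=nncf.CompressWeightsMode.INT4_SYM,
        group_size=group_size,
        ratio=1.0,
    )

    out = int4_output_path(encoder_xml)
    ov.save_model(compressed, str(out))
    return out
```

Calling `compress_encoder_to_int4(...)` with the path to your own encoder .xml should write a second .xml/.bin pair with the "-int4" suffix beside it. I have not checked whether the result matches the files in this repository, so compare speed and transcription quality on your own hardware.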