[Question] encoder-openvino-int4

#1
by jungsanghyun - opened

Is there a script or an explanation for producing the quantized whisper-encoder-openvino-int4 model?

Hi. I used the convert-whisper-to-openvino.py script from this PR (not merged yet): https://github.com/ggerganov/whisper.cpp/pull/2184
At least on my computer, the INT4 models are noticeably faster on CPU than the regular ones. The disadvantage is that they cannot be used on the GPU.
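
For anyone wondering what "INT4" means for the weights, here is a minimal sketch of symmetric, group-wise 4-bit quantization in plain Python. It is only a general illustration of the idea and is not taken from the PR script, whose internals may well differ; the function names and the `group_size` parameter are made up for this example.

```python
def quantize_int4_symmetric(weights, group_size=4):
    """Map floats to signed 4-bit integers, with one scale per group.

    Hypothetical helper for illustration; not the PR script's code.
    """
    quantized, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Signed INT4 covers [-8, 7]; dividing by 7 keeps the range symmetric.
        scale = max_abs / 7 if max_abs > 0 else 1.0
        scales.append(scale)
        # Round to the nearest integer and clamp to the 4-bit range.
        quantized.extend(max(-8, min(7, round(w / scale))) for w in group)
    return quantized, scales


def dequantize_int4(quantized, scales, group_size=4):
    """Recover approximate floats from the 4-bit values and group scales."""
    return [q * scales[i // group_size] for i, q in enumerate(quantized)]


if __name__ == "__main__":
    w = [0.7, -0.35, 0.1, 0.0, 1.4, -1.4, 0.2, 0.7]
    q, s = quantize_int4_symmetric(w)
    print(q)
    print(dequantize_int4(q, s))
```

Each weight ends up stored in 4 bits plus a shared scale per group, so the restored values are only approximations of the originals. The rounding error per weight is at most half of its group's scale.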
