[Question] encoder-openvino-int4

by jungsanghyun - opened about 1 month ago

Discussion

jungsanghyun

about 1 month ago

Is there a script or an explanation of how to create the quantized whisper-encoder-openvino-int4 model?

mukowaty

Owner 29 days ago

Hi. I used the convert-whisper-to-openvino.py script from this PR (not merged): https://github.com/ggerganov/whisper.cpp/pull/2184.
At least on my computer, the INT4 models are noticeably faster on the CPU than the regular ones. The disadvantage is that they cannot be used on the GPU.
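
For readers who want a starting point: the sketch below is not taken from the PR script, whose internals are not described in this thread. It shows one common way to get INT4 weights for an existing OpenVINO IR model, using NNCF weight compression. It assumes the `openvino` and `nncf` packages are installed and that the encoder has already been exported to an `.xml`/`.bin` pair. The function names, the `-int4` file suffix, and the `group_size` value are illustrative assumptions, not settings confirmed for these models.

```python
from pathlib import Path


def int4_output_path(xml_path: str) -> Path:
    """Hypothetical naming helper: 'encoder.xml' -> 'encoder-int4.xml'."""
    p = Path(xml_path)
    return p.with_name(f"{p.stem}-int4{p.suffix}")


def compress_encoder_to_int4(xml_path: str, group_size: int = 128) -> Path:
    """Compress the weights of an OpenVINO IR model to INT4 and save it.

    Assumption: this mirrors a typical NNCF workflow, not necessarily
    what convert-whisper-to-openvino.py in the PR does.
    """
    # Imported lazily so the naming helper works without these packages.
    import nncf
    import openvino as ov

    model = ov.Core().read_model(xml_path)

    # Weight-only compression: activations stay in floating point.
    compressed = nncf.compress_weights(
        model,
        mode=nncf.CompressWeightsMode.INT4_SYM,  # symmetric 4-bit weights
        group_size=group_size,  # weights share one scale per group
        ratio=1.0,  # request INT4 for all eligible layers
    )

    out_path = int4_output_path(xml_path)
    ov.save_model(compressed, str(out_path))  # writes .xml and .bin
    return out_path
```

Calling `compress_encoder_to_int4("encoder.xml")` would write `encoder-int4.xml` and `encoder-int4.bin` next to the input. Whether the result matches the models in this repository in speed or accuracy is untested here.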
