Support for quantizing whisper model?

#93
by lesserfield - opened

I was wondering if it would be possible to add support for Whisper.cpp? Thanks!

ggml.ai org

Hey! @lesserfield - That's a good question - I think we can perhaps pre-quantise all of the checkpoints and put them in the organisation somewhere. What do you think?

That would be helpful, I'm looking forward to it.

ggml.ai org

FYI, whisper.cpp does not support the GGUF format at the moment, so this may require more work.

Ref: https://github.com/ggerganov/whisper.cpp/blob/bf4cb4abad4e35c74b387df034cc4ac7b22e5fe6/whisper.cpp#L1332
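For anyone who wants to check which container format a given model file uses, here is a minimal sketch in Python. It assumes the legacy ggml magic is the 32-bit value `0x67676d6c` written little-endian, and that GGUF files start with the four ASCII bytes `GGUF`. The function names are hypothetical and not part of whisper.cpp.

```python
import struct

# Assumption: legacy ggml files begin with this uint32, stored little-endian.
GGML_MAGIC = 0x67676D6C  # spells "ggml" when read as a hex value
# Assumption: GGUF files begin with these four ASCII bytes.
GGUF_MAGIC = b"GGUF"


def detect_format(header: bytes) -> str:
    """Classify a model file from its first four bytes."""
    if len(header) < 4:
        return "unknown"
    if header[:4] == GGUF_MAGIC:
        return "gguf"
    (magic,) = struct.unpack("<I", header[:4])
    if magic == GGML_MAGIC:
        return "ggml"
    return "unknown"


def detect_file_format(path: str) -> str:
    """Read the first four bytes of a file on disk and classify it."""
    with open(path, "rb") as f:
        return detect_format(f.read(4))
```

Calling `detect_file_format` on a model path returns `"ggml"`, `"gguf"`, or `"unknown"`; a loader that only accepts the legacy magic would reject anything that is not `"ggml"`.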

ggml.ai org

Sorry, it has been a busy couple of days; getting back to this now:

@ngxson - I agree, and I think that's why it makes sense to have it separately as a repo in the GGML org. My plan is to pre-quantize and upload all the major whisper.cpp quants:
https://github.com/ggerganov/whisper.cpp?tab=readme-ov-file#quantization

WDYT?
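As a rough sketch of what batch pre-quantizing could look like, the Python helper below only builds the command lines and does not run anything. It assumes the `quantize` tool from the whisper.cpp README takes `<input> <output> <type>` as arguments, that the binary lives at `./quantize`, and that the listed type names are the ones of interest. The helper name and output naming scheme are hypothetical.

```python
from pathlib import Path

# Assumption: quantization type names accepted by the quantize tool.
QUANT_TYPES = ["q4_0", "q4_1", "q5_0", "q5_1", "q8_0"]


def quantize_commands(model_path, quant_types=QUANT_TYPES, quantize_bin="./quantize"):
    """Build one quantize command (as an argument list) per quantization type."""
    src = Path(model_path)
    commands = []
    for qtype in quant_types:
        # e.g. ggml-base.en.bin -> ggml-base.en-q5_0.bin
        dst = src.with_name(f"{src.stem}-{qtype}{src.suffix}")
        commands.append([quantize_bin, str(src), str(dst), qtype])
    return commands


if __name__ == "__main__":
    for cmd in quantize_commands("models/ggml-base.en.bin"):
        print(" ".join(cmd))
```

Each argument list can be passed to `subprocess.run` once the binary path and type names have been checked against the installed whisper.cpp version.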

ggml.ai org

There are already quite a lot of pre-quantised models here, btw: https://huggingface.co/ggerganov/whisper.cpp/tree/main

lesserfield changed discussion status to closed
