My conversion of a fine-tuned Whisper model to GGML failed

#10
by Maojung - opened

Thanks for the .cpp version. It speeds up my transcription by 2x to 3x compared to the original OpenAI implementation. I found a Whisper v2 model fine-tuned for Chinese at https://huggingface.co/jonatasgrosman/whisper-large-zh-cv11, which gives much better accuracy. After a little tweaking of the ggml conversion Python script, I managed to produce the new model file without errors. However, the converted model does not work as expected at all: it only generates garbage. If I switch back to the repository's stock model, everything works perfectly. There were no error messages during my ggml conversion, and I have no idea how to fix it. I can't upload my .bin file here because it's around 3 GB. Please help. Here is my tweak of the conversion routine.

Here are the specific steps I ran:

```shell
# clone the reference repos
git clone https://github.com/openai/whisper
git clone https://github.com/ggerganov/whisper.cpp

# clone the HF fine-tuned model (this is just an example)
git clone git@hf.co:jonatasgrosman/whisper-large-zh-cv11

# convert the model to ggml
python ./whisper.cpp/models/convert-h5-to-ggml.py ./whisper-large-zh-cv11/ ./whisper .
```

The pytorch_model.bin file at https://huggingface.co/jonatasgrosman/whisper-large-zh-cv11 is 6.17 GB, almost double the size of the ggml large model from https://github.com/ggerganov/whisper.cpp. Could this be because of a reduced bit width in the ggml format? I am just curious.
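A roughly 2x size difference is what you would expect if a float32 checkpoint is stored as float16 (which, as far as I can tell, is the default for the whisper.cpp conversion script), rather than integer quantization. The arithmetic in a minimal sketch:

```python
import numpy as np

# A float32 tensor uses 4 bytes per element; its float16 copy uses 2.
w32 = np.zeros((1024, 1024), dtype=np.float32)
w16 = w32.astype(np.float16)

print(w32.nbytes)  # 4194304 bytes
print(w16.nbytes)  # 2097152 bytes
```

So a ~6 GB fp32 checkpoint landing at ~3 GB after conversion is consistent with fp16 storage alone.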
