Enabling custom Japanese model

#5
by idoitforthecookies - opened

In the config file I noticed the part about uncommenting a section to enable custom Japanese models. I tried uncommenting it, but it led me down a rabbit hole of errors, starting with

ValueError: Invalid model size 'vumichien/whisper-large-v2-mix-jp', expected one of: tiny.en, tiny, base.en, base, small.en, small, medium.en, medium, large-v1, large-v2

and as I tried to fix each one, new ones just kept popping up. So my question is: is this feature not implemented yet, or am I missing something obvious?

Thanks in advance!

There's currently no support for automatic conversion to the model type (CTranslate2) used by faster-whisper, but you can do this manually using the CLI.

For instance, to convert vumichien/whisper-large-v2-mix-jp to CTranslate2, first download the repository locally:

git lfs install
git clone https://huggingface.co/vumichien/whisper-large-v2-mix-jp
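
If you prefer to stay in Python, recent versions of huggingface_hub can download the repository as well (just a sketch; the local_dir argument assumes a reasonably new huggingface_hub):

from huggingface_hub import snapshot_download

# Download the fine-tuned model repository into a local folder
snapshot_download(repo_id="vumichien/whisper-large-v2-mix-jp",
                  local_dir="whisper-large-v2-mix-jp")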

Then add the tokenizer.json file from openai/whisper-large-v2, as it appears to be missing in whisper-large-v2-mix-jp:

cd whisper-large-v2-mix-jp
wget https://huggingface.co/openai/whisper-large-v2/raw/main/tokenizer.json
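</antml
The same file can also be fetched with huggingface_hub instead of wget, if that is more convenient (again only a sketch, assuming a huggingface_hub version that supports local_dir):

from huggingface_hub import hf_hub_download

# Copy tokenizer.json from the original OpenAI repository into the
# local whisper-large-v2-mix-jp folder
hf_hub_download(repo_id="openai/whisper-large-v2",
                filename="tokenizer.json",
                local_dir="whisper-large-v2-mix-jp")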

Then, use the ct2-transformers-converter CLI (as explained in the README of faster-whisper) to convert the model in whisper-large-v2-mix-jp to CTranslate2 (you might want to use Anaconda or a Python virtual environment here):

cd ..
pip install "transformers[torch]>=4.23"
ct2-transformers-converter --model ./whisper-large-v2-mix-jp --output_dir whisper-large-v2-ct2 --copy_files tokenizer.json --quantization float16
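
If you'd rather script the conversion, the same thing can be done through CTranslate2's Python converter API (a sketch, assuming ctranslate2 and transformers are installed in the same environment):

from ctranslate2.converters import TransformersConverter

# Convert the Transformers checkpoint to CTranslate2 format with
# float16 weights, copying tokenizer.json into the output directory
converter = TransformersConverter("./whisper-large-v2-mix-jp",
                                  copy_files=["tokenizer.json"])
converter.convert("whisper-large-v2-ct2", quantization="float16")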

That should produce the converted model in the directory whisper-large-v2-ct2:

[Screenshot: contents of the converted model directory]

You can now reference this directory directly if you use a type other than "whisper"; here I just set it to "filesystem" to disable any conversion. (I used a Windows-style path, but you can use Unix paths if you are on Linux/Mac.)

{
    "name": "whisper-large-v2-mix-jp" ,
    "url": "J:\\Dev\\Models\\whisper\\whisper-large-v2-mix-jp-ct2",
    "type": "filesystem",
}
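
To check the converted model before wiring it into the config, you can load the directory directly with faster-whisper (a minimal sketch; the audio file name is just a placeholder):

from faster_whisper import WhisperModel

# Load the converted CTranslate2 model from the local directory
model = WhisperModel("J:/Dev/Models/whisper/whisper-large-v2-mix-jp-ct2",
                     device="cuda", compute_type="float16")

# Transcribe a short Japanese clip to make sure the conversion worked
segments, info = model.transcribe("sample.mp3", language="ja")
for segment in segments:
    print(segment.start, segment.end, segment.text)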

But yeah, ideally faster-whisper ought to be able to do this conversion automatically. It just hasn't been a priority, as I find "large-v2" to be more than good enough for my use.
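
For what it's worth, the automatic conversion could probably be a small wrapper along these lines (purely a hypothetical sketch, not anything implemented in faster-whisper or this WebUI; ensure_ct2_model is a made-up helper name):

import os
from ctranslate2.converters import TransformersConverter

def ensure_ct2_model(model_path, output_dir):
    """Convert a Transformers checkpoint to CTranslate2 on first use."""
    # A directory that already contains model.bin is a CTranslate2 model
    if os.path.isfile(os.path.join(model_path, "model.bin")):
        return model_path
    # Otherwise convert it once and reuse the converted copy afterwards
    if not os.path.isdir(output_dir):
        converter = TransformersConverter(model_path,
                                          copy_files=["tokenizer.json"])
        converter.convert(output_dir, quantization="float16")
    return output_dir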

Thanks a lot, I got it to work!
I agree that in general the large-v2 model is very good; it's just that some topics sometimes translate a bit oddly. I was mostly curious how much better this specialized model actually is.
Again, thanks for the help on this, and thanks in general for the project. It's been a big help!

idoitforthecookies changed discussion status to closed
