Whisper JAX integration?

#26
by Robis - opened

Is it possible to also integrate the new Whisper JAX into this web UI? I just tested how fast it is on their Space, and it's crazy 😄 A 1-minute video takes a couple of seconds.

https://huggingface.co/spaces/sanchit-gandhi/whisper-jax

It's definitely possible by making a new WhisperContainer in the UI, though if you just need a UI to run sanchit-gandhi/whisper-jax there's already a Gradio UI implementation in that repository. And I presume you need to rent a TPU to run it?
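To give a rough idea of what "a new container" could look like, here is a minimal sketch of a pluggable backend. All names in it (`WhisperBackend`, `TimedSegment`, `DummyBackend`, `create_backend`) are hypothetical and made up for illustration; this is not the web UI's actual container API, and the dummy backend only stands in for a real model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Protocol


@dataclass
class TimedSegment:
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio
    text: str


class WhisperBackend(Protocol):
    """Hypothetical interface that every backend would implement."""

    def transcribe(self, audio_path: str, language: Optional[str] = None) -> List[TimedSegment]:
        ...


class DummyBackend:
    """Stand-in backend so the sketch runs without any model installed."""

    def transcribe(self, audio_path: str, language: Optional[str] = None) -> List[TimedSegment]:
        return [TimedSegment(0.0, 1.5, f"(transcript of {audio_path})")]


# Registry of available backends; a Whisper JAX backend would be
# added here under its own name (assumption, for illustration only).
BACKENDS: Dict[str, type] = {"dummy": DummyBackend}


def create_backend(name: str) -> WhisperBackend:
    """Look up a backend by name and instantiate it."""
    try:
        return BACKENDS[name]()
    except KeyError:
        raise ValueError(f"unknown backend: {name}") from None
```

With something like this, the UI only talks to `create_backend(name).transcribe(...)`, so adding another implementation would mean registering one more class.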

But yeah, I did some work on adding support for insanely-fast-whisper in another branch, but I wasn't able to get the same insane speed when combined with a VAD (silero), which is usually needed for proper synchronization for non-English languages such as Japanese.
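For context, combining a backend with a VAD roughly means transcribing each detected speech span on its own and shifting the timestamps back to the full recording. The sketch below only illustrates that flow; `transcribe_with_vad`, `fake_transcribe` and the span values are assumptions for the example, not code from the web UI or from silero.

```python
from typing import Callable, List, Tuple

# (start_seconds, end_seconds, text)
TimedText = Tuple[float, float, str]


def transcribe_with_vad(
    speech_spans: List[Tuple[float, float]],
    transcribe_span: Callable[[float, float], List[TimedText]],
) -> List[TimedText]:
    """Transcribe each speech span separately and return global timestamps."""
    result: List[TimedText] = []
    for span_start, span_end in speech_spans:
        for seg_start, seg_end, text in transcribe_span(span_start, span_end):
            # The backend reports times relative to the span, so add the
            # span offset and clamp the end to the span boundary.
            result.append(
                (span_start + seg_start, min(span_start + seg_end, span_end), text)
            )
    return result


def fake_transcribe(start: float, end: float) -> List[TimedText]:
    """Stand-in for a real backend call on the audio between start and end."""
    return [(0.0, end - start, f"speech {start:.1f}-{end:.1f}")]


# Hypothetical VAD output: two speech spans, in seconds.
segments = transcribe_with_vad([(1.0, 3.0), (10.0, 12.5)], fake_transcribe)
```

Because the subtitles are anchored to the VAD spans, silence between spans cannot pull the timestamps out of sync.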

Not sure if whisper-jax will have the same issue, but for the time being I find faster-whisper fast enough for my use. That said, I'd be happy to merge a new Whisper backend if it is an improvement over the existing implementations.
