openai/whisper-large-v2 · Recurring problems on pipe

Mar 8, 2023

Hi everyone,
I am trying to use Whisper to transcribe multi-language audio, and I have seen some problems when using the pipeline method.
First, the spoken language is sometimes French, but the output looks like a translation of the original speech.
Second, phrases are sometimes repeated, sometimes once and sometimes several times.
My code for that part is quite simple:
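from transformers import pipeline

# Chunked long-form transcription: the audio is split into 30-second windows.
pipe = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-tiny",
    chunk_length_s=30,
)
# `audio` is the input to transcribe, defined elsewhere.
pipe(audio, return_timestamps=True)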

Any idea why I am getting this?

remcbdx

Mar 8, 2023

Regarding the translation problem, I found https://huggingface.co/openai/whisper-large-v2/discussions/20
The difference in my case is that it also happens on my local machine (M2).
If I just want the original language with no translation, should I specify something?
And if I have multi-language audio files, should I choose one language?

ArthurZ

Mar 16, 2023

You should use the latest release! This should have been fixed!

sanchit-gandhi

Mar 17, 2023

edited Mar 17, 2023

Indeed! @ArthurZ has done a great job improving the Whisper API in transformers. Updating to the latest version should get you these upgrades:

pip install --upgrade transformers

If you know the language a priori, you can pass it as follows:

pipe(audio, return_timestamps=True, generate_kwargs={"language": "french"})

Likewise, you can specify the task as translate/transcribe:

pipe(audio, return_timestamps=True, generate_kwargs={"language": "french", "task": "transcribe"})
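
Putting the two calls above together, here is a minimal sketch of how the options could be wrapped up. The helper names `build_generate_kwargs` and `transcribe` are made up for illustration and are not part of transformers, and the choice of `openai/whisper-large-v2` is an assumption (swap in whichever checkpoint you use). The sketch always sets `"task": "transcribe"` so the output stays in the spoken language, and it only adds `"language"` when you pass one. For multi-language files you can leave the language out; as far as I understand, recent versions then let Whisper detect the language itself, but that is worth checking on your own audio.

```python
from typing import Optional


def build_generate_kwargs(language: Optional[str] = None,
                          task: str = "transcribe") -> dict:
    """Build the generate_kwargs dict for the ASR pipeline (hypothetical helper).

    When language is None the "language" key is omitted, so no language is forced.
    """
    if task not in ("transcribe", "translate"):
        raise ValueError(f"task must be 'transcribe' or 'translate', got {task!r}")
    kwargs = {"task": task}
    if language is not None:
        kwargs["language"] = language
    return kwargs


def transcribe(audio, language: Optional[str] = None) -> dict:
    """Run chunked transcription on `audio` (hypothetical helper)."""
    # Imported here so build_generate_kwargs can be used without loading transformers.
    from transformers import pipeline

    pipe = pipeline(
        "automatic-speech-recognition",
        model="openai/whisper-large-v2",  # assumption: use your own checkpoint
        chunk_length_s=30,
    )
    return pipe(
        audio,
        return_timestamps=True,
        generate_kwargs=build_generate_kwargs(language=language),
    )
```

For example, `transcribe(audio, language="french")` forces French transcription, while `transcribe(audio)` leaves the language unspecified. With `return_timestamps=True`, the returned dict holds the full transcript under `"text"` and the timestamped segments under `"chunks"`.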

Let us know if you have any other questions, more than happy to help!