openai/whisper-medium used in pipeline automatically translates to English

#17
by private-langsource - opened

I am trying to produce a transcription of an audio file downloaded from YouTube.
The resulting transcription is mostly an English translation of the audio rather than a transcription in the original language.

Using the following code:

import torch
from transformers import pipeline

audio_input = 'XXXX.m4a'

# Use GPU if available, otherwise fall back to CPU
device = "cuda:0" if torch.cuda.is_available() else "cpu"

pipe = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-medium",
    chunk_length_s=30,
    device=device,
    # Force transcription (not translation) in Polish
    generate_kwargs={"task": "transcribe", "language": "<|pl|>"},
)

prediction = pipe(audio_input, batch_size=8, return_timestamps=False)

I am having exactly the same problem.
Isn't there a solution to this problem yet?

Hi,
I tried adding task='transcribe' and language as attributes directly, which did not work.
But then the generate_kwargs workaround solved my problem.
Thanks for the tip.
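
For reference, a minimal sketch of that workaround, assuming a Polish audio file (the file name is a placeholder). If I recall correctly, generate_kwargs can also be passed per call rather than at pipeline construction, which makes it easy to switch languages with the same pipeline:

import torch
from transformers import pipeline

device = "cuda:0" if torch.cuda.is_available() else "cpu"

pipe = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-medium",
    chunk_length_s=30,
    device=device,
)

# Passing generate_kwargs at call time (placeholder file name)
prediction = pipe(
    "sample_pl.m4a",
    generate_kwargs={"task": "transcribe", "language": "<|pl|>"},
)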

Could you share an audio sample for this behaviour, please? The code looks correct! You can also specify the language in string form if you're on the latest version of transformers:

pip install -U transformers
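
For example, on a recent transformers release the language can be given as a plain string instead of the "<|pl|>" token form (a sketch; the "polish" value and file name are illustrative):

from transformers import pipeline

pipe = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-medium",
    chunk_length_s=30,
    # Plain language name or code instead of the "<|pl|>" token form
    generate_kwargs={"task": "transcribe", "language": "polish"},
)

prediction = pipe("XXXX.m4a", batch_size=8, return_timestamps=False)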
