openai/whisper-large-v3 · Transcribing multiple languages in a single audio file

Dec 15, 2023

What about recordings in which people speak multiple languages, such as podcasts where some utterances are in English even though most of the audio is in another language? If I set the language to the main spoken language, the transcription is better than when no language tag is provided or English is chosen. However, when the audio reaches a part where they speak English, the model has trouble transcribing it properly, unless English was chosen in the first place, and that does badly on the other language.

What is the best way to transcribe this? Should I use the pipeline, and with what kwargs, such as word-level timestamps or no timestamps? Should I set the language at the start, or do something else? Please guide.

iadithyan

Dec 28, 2023

Hi,

I have the same problem. Did you figure this out? :)

Best,
Adi

supercharge19

Dec 29, 2023

Not exactly a solution, but this worked best:

Set the language to auto and then set the task to translate.
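For anyone who wants to try this with the transformers pipeline, here is a minimal sketch of what it might look like. It assumes that "auto" means simply not passing a language, so that Whisper picks the language itself, and that the task is passed through `generate_kwargs`. The helper names `build_generate_kwargs` and `transcribe_file` are made up for illustration, and `chunk_length_s=30` is just one reasonable choice for long audio. Note that the translate task produces English text rather than a transcript in the original languages.

```python
from typing import Optional


def build_generate_kwargs(task: str = "translate", language: Optional[str] = None) -> dict:
    """Build generate_kwargs for a Whisper ASR pipeline (hypothetical helper).

    When `language` is None the key is omitted entirely, which is the
    assumption made here for "auto" language detection.
    """
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unsupported task: {task!r}")
    kwargs = {"task": task}
    if language is not None:
        kwargs["language"] = language
    return kwargs


def transcribe_file(audio_path: str, task: str = "translate",
                    language: Optional[str] = None) -> str:
    """Run whisper-large-v3 on one audio file and return the text (hypothetical helper)."""
    # Imported lazily so the helper above can be used without loading transformers.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model="openai/whisper-large-v3",
        chunk_length_s=30,  # split long recordings into 30-second chunks
    )
    result = asr(audio_path, generate_kwargs=build_generate_kwargs(task, language))
    return result["text"]
```

Calling `transcribe_file("podcast.mp3")` (the file name is a placeholder) would run with no language set and the task set to translate. Passing `task="transcribe", language="korean"` instead would force a single language, which is the setup described in the original question.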

lalok

Mar 27, 2024

edited Mar 27, 2024

Any updates on this? I am trying to transcribe an audio file containing Korean, Japanese, and English, with Korean being the main language.