Try to do code-switching. It seems perfect.

#45
by ryL - opened

Input audio:

Transcription:

I think I am trying to speak a three kind of language simultaneously.我又講中文、又講英文。Sometimes I can speak 日本語。日本語は話せできます。おはようございます。さようなら。

How did you bypass the singular language detection? It seems it is selecting a single language for each audio clip when I use it.

It seems it is selecting a single language for each audio clip when I use it

This is the expected behaviour! Not sure how @ryL got multiple language outputs?

could you please share your code? @ryL

How could you get multiple languages in your transcriptions?

I would also be very interested in knowing how @ryL got it to work for a single audio clip!

For anyone stumbling here - I made it work using an additional prompt:

You are a professional transcriber, fluent in language1 and language2.
You are listening to a recording in which a person is potentially speaking both language1 and language2, and no other languages.
They may be speaking only one of these languages. They may have a strong accent.
You are to transcribe utterances of each language accordingly.
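For reference, here is a minimal sketch of how the prompt above could be filled in and passed to Whisper. The thread does not say which API the prompt was used with, so the use of the open-source `openai-whisper` package and its `initial_prompt` argument is an assumption, and `build_prompt` and `transcribe_with_prompt` are hypothetical helper names. Note that Whisper treats the prompt as preceding context rather than as instructions, so results may differ from what was reported here.

```python
def build_prompt(language1: str, language2: str) -> str:
    """Fill the two language placeholders in the prompt quoted above."""
    return (
        f"You are a professional transcriber, fluent in {language1} and {language2}.\n"
        f"You are listening to a recording in which a person is potentially "
        f"speaking both {language1} and {language2}, and no other languages.\n"
        "They may be speaking only one of these languages. "
        "They may have a strong accent.\n"
        "You are to transcribe utterances of each language accordingly."
    )


def transcribe_with_prompt(audio_path: str, language1: str, language2: str,
                           model_name: str = "large-v2") -> str:
    """Transcribe one clip, passing the prompt as Whisper's initial context.

    Assumption: the openai-whisper package is installed. It is imported
    lazily so that build_prompt can be used without it.
    """
    import whisper

    model = whisper.load_model(model_name)
    # No `language` argument is passed, so Whisper still runs its own
    # language detection on the clip.
    result = model.transcribe(
        audio_path,
        initial_prompt=build_prompt(language1, language2),
    )
    return result["text"]
```

Usage would look like `transcribe_with_prompt("clip.wav", "Mandarin", "English")`, where `clip.wav` is a placeholder path.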

@odusseys It seems like you used Whisper in GPT, did you?
@ryL could you share the code that got you code-switching output in Whisper? Or maybe what approach you took for the same?

Using whisperx directly could get the same transcription, almost perfect.
