Generates gibberish in some parts of YouTube Transcript

#33
by bayramg - opened

I recently used this for this video https://www.youtube.com/watch?v=PDy7s1SDDn4, but it generated gibberish in some parts of the transcript. Do you think that comes from Whisper itself?

Here is an excerpt from the transcript:

“ Felly, er mwyn ddefnyddio AI ychydig, Tesla defnyddia llawer.. Dim ond, ni'n mynd i'r hyn i'r sefyllfa, os yw'n iawn.. Un peth final o ran y gyrfedb o'r hyn rydych chi'n ei wneud gyda'ch bywyd, rydych chi'n rhoi tri pwynt o ddyniadau, rydych chi'n cael pwyntau fawr a control o'r ddwy o'r pwynt, ynglyn â'r ddyniadau.. Pa'n eich plan o'r afael os ydych chi'n arall gweithio ar y pwynt yr ydych chi'n ei wneud, of two of those at least.. What is your succession plan if you suddenly can't execute what you're doing, both in terms of who runs the companies, but as importantly, who votes those shares in terms of what happens longer term and strategically?. Have you got a plan for all of those?”

Hi there, the part of the transcription you posted seems to be in Welsh. I tried to replicate the problem, and it seems fine on my end.

It still does for me. Maybe it is translating, because neither Musk nor the interviewer speaks Welsh. The affected timestamps run from [09:03.240 -> 09:06.000] to [09:43.000 -> 09:45.040].


It does even worse on this video, where the speaker has a fairly thick Turkish accent but speaks English. It ruins the whole transcript by mixing Turkish and English, even though he speaks English throughout the video: https://www.youtube.com/watch?v=uN68YtUI4t0
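
For anyone trying to narrow down which parts of a transcript drifted into another language, here is a rough sketch of a stopword check. It assumes each segment is a dict with `start`, `end`, and `text` keys (the shape Whisper-style outputs commonly use). The stopword list, the `english_ratio` and `flag_suspect_segments` names, and the 0.2 threshold are all made up for illustration, not anything this Space provides.

```python
import re

# Hypothetical, deliberately tiny stopword list. A real check would use a
# proper language-identification library instead of this heuristic.
ENGLISH_STOPWORDS = {
    "the", "a", "an", "of", "to", "in", "is", "are", "you", "your", "that",
    "what", "and", "for", "if", "it", "this", "those", "have", "who", "but",
    "as", "at", "all", "got", "can't", "do", "we", "on", "with",
}


def english_ratio(text):
    """Return the fraction of words in `text` that are common English words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in ENGLISH_STOPWORDS)
    return hits / len(words)


def flag_suspect_segments(segments, threshold=0.2):
    """Return the segments whose English stopword ratio is below `threshold`."""
    return [seg for seg in segments if english_ratio(seg["text"]) < threshold]


# Example using two sentences from the excerpt above; the second segment's
# timestamps are placeholders, not values from the real transcript.
segments = [
    {"start": 543.24, "end": 546.0,
     "text": "Felly, er mwyn ddefnyddio AI ychydig, Tesla defnyddia llawer."},
    {"start": 585.04, "end": 590.0,
     "text": "What is your succession plan if you suddenly can't execute what you're doing?"},
]

for seg in flag_suspect_segments(segments):
    print(f"[{seg['start']:.2f} -> {seg['end']:.2f}] {seg['text']}")
```

This only points at suspicious spans; it does not fix them. If the underlying code uses the openai-whisper package, passing `language="en"` to `model.transcribe(...)` skips automatic language detection, which may help with accented English, though whether this Space exposes that option is not something the thread confirms.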
