Great Job! Feature request, important for Hebrew

by Forepick - opened


First of all, thanks for the great work on this model! This is truly an important breakthrough!

One thing I noticed is missing (or I just haven't found it yet) in Hebrew transcription is the vowel pointing ("Nikud").
While the writing systems of most Latin (and oriental) languages consist of both consonants and vowels, in Hebrew the exact vowels come from diacritical marks like "Patach", "Segol" and "Hirik".

This is important mainly for words that can be read in more than one way, such as the name אורי, which without vowel marks can be read as either "Uri" or "Ori".
Asking the model to transcribe the audio as English does give you the vowels, of course, but in many cases that transcription is just very wrong.
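
To make the ambiguity concrete, here is a minimal Python sketch. It is only an illustration and has nothing to do with the model's own code: `strip_nikud` is a hypothetical helper I made up, and the two pointed spellings are my own rendering of the name, written as Unicode escapes so the marks are visible. It shows that two different pointed forms collapse into the same unpointed string.

```python
import unicodedata


def strip_nikud(text: str) -> str:
    """Drop Unicode nonspacing marks (category Mn), which include Hebrew vowel points."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# Hypothetical pointed spellings of the same four letters (alef, vav, resh, yod).
URI = "\u05D0\u05D5\u05BC\u05E8\u05B4\u05D9"  # vav + dagesh (shuruk), resh + hirik
ORI = "\u05D0\u05D5\u05B9\u05E8\u05B4\u05D9"  # vav + holam, resh + hirik

print(URI != ORI)                            # the pointed forms differ
print(strip_nikud(URI) == strip_nikud(ORI))  # the unpointed forms are identical
```

So a transcription without Nikud cannot tell these two names apart, even though the audio can.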

Am I missing something? If not, please consider this an important feature request.

Again, thanks for the great work!
