Can't seem to handle full-sentence speech, and low performance on single-word translations

#10
by JeffyHsieh - opened

Native Taiwanese Hokkien speaker here.
I tried a couple of examples in both En->Hokkien and Hokkien->En:

  1. It can't seem to handle short sentences (e.g. "Have you eaten yet?" or "The weather is nice today."). The output is usually a single short word or an unintelligible sound.
  2. Single-word translation is also not very good. En->Hokkien only worked about 2 out of 20 times. Hokkien->En was slightly better, but the output sometimes contains a lot of repetition (e.g. 暗頓 -> "dinner and dinner").
    Fun fact: 10 out of 20 times, the En->Hokkien translation gave me 笑啥.

I was wondering if you could add controls to the demo for trying different sampling temperatures, top-p values, or other inference parameters.
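
For anyone unfamiliar with these two knobs, here is a minimal sketch of what temperature and top-p (nucleus) sampling do to a model's output distribution. It is plain Python over a toy list of logits. The function name `sample_top_p` and the example values are made up for illustration and are not taken from the demo's code; whether the model behind the demo samples at all (rather than using beam search or greedy decoding) is an assumption.

```python
import math
import random


def sample_top_p(logits, temperature=1.0, top_p=1.0, rng=None):
    """Pick one index from `logits` using temperature scaling and top-p filtering.

    Hypothetical helper for illustration only; not part of any library.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    rng = rng or random.Random()

    # Temperature: values < 1 sharpen the distribution, values > 1 flatten it.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-p: keep the most likely candidates until their mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Sample among the kept candidates, in proportion to their probability.
    threshold = rng.random() * sum(probs[i] for i in kept)
    running = 0.0
    for i in kept:
        running += probs[i]
        if threshold < running:
            return i
    return kept[-1]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.1]
    print(sample_top_p(toy_logits, temperature=0.7, top_p=0.9))
```

With the toy logits above, `top_p=0.5` keeps only the most likely candidate, so the output becomes deterministic; raising `temperature` or `top_p` lets less likely candidates through.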

Hi, thanks for the feedback. This is something we are going to improve. Our models are not very robust on short sentences, so the results are often poor when users try the demo with short audio clips.
