Frequent Appearance of <unk> Tokens in Translations to Mandarin (cmn & cmn_Hant) and Cantonese (yue)

#20
by tanshuai - opened

The output often contains the <unk> token when the source phrase includes an interjection or certain other words. For example, "Oh, Peter" becomes "<unk>,彼得." and "Oh, my God" becomes "<unk>,我的上帝." The issue is not limited to "Oh"; various other words are also rendered as <unk>, which severely limits the project's usability for Mandarin and Cantonese translation.
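
To make the problem easier to quantify, here is a minimal sketch of how one might scan a batch of translations for the <unk> token. It does not call the model; it only inspects (source, output) string pairs that have already been collected. The helper names `find_unk_outputs` and `unk_rate` are hypothetical and chosen for illustration, and the sample pairs are the two examples quoted above.

```python
# Literal token as it appears in the reported outputs.
UNK = "<unk>"


def find_unk_outputs(pairs):
    """Return the (source, translation) pairs whose translation contains <unk>."""
    return [(src, out) for src, out in pairs if UNK in out]


def unk_rate(pairs):
    """Fraction of translations that contain at least one <unk> token."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    return len(find_unk_outputs(pairs)) / len(pairs)


# The two examples from this report.
samples = [
    ("Oh, Peter", "<unk>,彼得."),
    ("Oh, my God", "<unk>,我的上帝."),
]

for src, out in find_unk_outputs(samples):
    print(f"{src!r} -> {out!r}")
print(f"affected: {unk_rate(samples):.0%}")
```

Running this over a larger set of source phrases would show which words trigger <unk> most often, which could help narrow down whether the gap is in the vocabulary or elsewhere.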
