Russian language TTS model misplaces word stress on certain words, affecting pronunciation accuracy

#23
by AndreySokolov01 - opened

The Russian-language TTS model occasionally places lexical stress on the wrong syllable in certain words, resulting in unnatural or incorrect pronunciation. This issue is most noticeable in:
Multi-syllabic words with non-default stress patterns
Homographs where stress changes meaning (e.g., зАмок / замОк)
Loanwords and proper nouns
While the model generally produces intelligible speech, incorrect stress reduces naturalness and may cause confusion in professional or educational contexts.

Words should be pronounced with correct lexical stress according to standard Russian orthoepy:
договОр (not дОговор)
звонИт (not звОнит)
кУхонный (not кухОнный)
тОрты (not тортЫ)

Russian is a stress-accent language where stress is phonemic (changes meaning) and unpredictable from spelling alone.
The model does not appear to use external stress dictionaries (e.g., OpenCorpora, Zaliznyak's Grammatical Dictionary) or context-aware disambiguation of homographs.
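To illustrate what a dictionary-based preprocessing step could look like, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `STRESS_LEXICON` is a tiny hand-written stand-in for a real stress dictionary, `annotate` and `mark_stress` are hypothetical helper names, and whether this model accepts a combining acute accent (U+0301) as a stress hint in its input is not known. Homographs such as замок are left unmarked and reported, since picking the right reading needs context.

```python
import re

# Hypothetical toy lexicon: lowercase word -> set of possible stressed-vowel
# positions (0-based, counting vowels only). A real system would load this
# from a full stress dictionary.
STRESS_LEXICON = {
    "кухонный": {0},   # кУхонный
    "торты": {0},      # тОрты
    "замок": {0, 1},   # homograph: зАмок (castle) / замОк (lock)
}

VOWELS = "аеёиоуыэюя"
ACUTE = "\u0301"  # combining acute accent, placed after the stressed vowel


def mark_stress(word, vowel_index):
    """Insert a combining acute after the vowel at position `vowel_index`."""
    seen = -1
    out = []
    for ch in word:
        out.append(ch)
        if ch.lower() in VOWELS:
            seen += 1
            if seen == vowel_index:
                out.append(ACUTE)
    return "".join(out)


def annotate(text):
    """Return (annotated_text, ambiguous_words).

    Words with exactly one known stress position are marked. Unknown words
    and homographs are left unchanged; homographs are collected so that a
    context-aware step could resolve them later.
    """
    ambiguous = []

    def repl(match):
        word = match.group(0)
        options = STRESS_LEXICON.get(word.lower())
        if not options:
            return word
        if len(options) > 1:
            ambiguous.append(word)
            return word
        return mark_stress(word, next(iter(options)))

    return re.sub(r"[а-яёА-ЯЁ]+", repl, text), ambiguous


annotated, unresolved = annotate("кухонный замок и торты")
print(annotated)
print(unresolved)
```

In this sketch the unresolved list is the interesting part: it marks exactly the words where a dictionary alone is not enough and a part-of-speech tagger or a context model would have to choose the stress.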
