Time stamp/

#1
by anticope - opened

Hello team, amazing solution.

Just wondering how it's possible to process or build the audio so that it matches the timestamps of the original text.
Let's say the text is based on srt or vrt?

balacoon org

You can achieve this with pure signal processing applied to the audio, for example with the "speed" effect from sox (https://sox.sourceforge.net/sox.html).
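
As a rough illustration of the signal-processing route, the sketch below works out how much a synthesized clip would need to be sped up or slowed down to fit one subtitle cue, and builds the matching sox command line. The helper names (`to_seconds`, `cue_duration`, `sox_fit_command`) are made up for this example, and it assumes the cue uses the usual `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line. It defaults to the sox "tempo" effect, which changes speed without changing pitch; the "speed" effect mentioned above changes pitch and tempo together and takes the same kind of factor.

```python
import re

# Matches subtitle timestamps such as "00:01:02,500" (comma or dot before ms).
_TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def to_seconds(timestamp: str) -> float:
    """Convert an 'HH:MM:SS,mmm' timestamp to seconds."""
    match = _TIMESTAMP.fullmatch(timestamp.strip())
    if match is None:
        raise ValueError(f"unrecognized timestamp: {timestamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000.0


def cue_duration(cue_line: str) -> float:
    """Duration in seconds of a timing line like '00:00:01,000 --> 00:00:03,000'."""
    start, end = cue_line.split("-->")
    # Ignore anything that follows the end timestamp on the same line.
    return to_seconds(end.split()[0]) - to_seconds(start)


def sox_fit_command(src, dst, audio_seconds, cue_line, effect="tempo"):
    """Build a sox command that stretches `src` to the cue's duration.

    A factor above 1 makes the audio faster (shorter); below 1, slower.
    """
    target = cue_duration(cue_line)
    if target <= 0 or audio_seconds <= 0:
        raise ValueError("durations must be positive")
    factor = audio_seconds / target
    return ["sox", src, dst, effect, f"{factor:.4f}"]


if __name__ == "__main__":
    # A 3-second clip that has to fit a 2-second cue.
    print(" ".join(sox_fit_command("in.wav", "out.wav", 3.0,
                                   "00:00:01,000 --> 00:00:03,000")))
```

The resulting list can be passed to `subprocess.run` once sox is installed. Large factors will sound unnatural, so in practice it may be worth clamping the factor to a modest range.
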
Additionally, in the case of synthesis, you have two options: 1) scale the durations used when synthesizing the audio; 2) condition duration prediction on some global information about the expected duration.
Both options are outside typical TTS usage and are usually not supported by the default API.
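
To make option 1 a bit more concrete, here is a minimal sketch of scaling predicted durations so that they add up to a target length. It assumes a model that exposes per-phoneme durations as integer frame counts, which, as noted above, the default API usually does not. The function name `scale_durations` and the frame-based representation are hypothetical and only illustrate the arithmetic.

```python
def scale_durations(durations, target_total):
    """Rescale integer frame counts so they sum to `target_total`.

    Rounding is applied to the running (cumulative) total rather than to
    each item, so rounding errors do not accumulate.
    """
    total = sum(durations)
    if total <= 0 or target_total < 0:
        raise ValueError("need a positive source total and a non-negative target")
    factor = target_total / total

    scaled = []
    running = 0.0   # exact scaled position so far
    emitted = 0     # frames already assigned
    for frames in durations:
        running += frames * factor
        boundary = round(running)
        scaled.append(boundary - emitted)
        emitted = boundary
    return scaled


if __name__ == "__main__":
    # Three phonemes totalling 10 frames, stretched to fill 20 frames.
    print(scale_durations([3, 5, 2], 20))
```

When shrinking aggressively, very short units can round down to zero frames, so a real implementation would likely enforce a minimum per unit.
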

clementruhm changed discussion status to closed
