Is there a way to change the voice, accent, and the way the input sentence is spoken? For example, emphasising one word, speeding up on one word, slowing down, etc.
Is the TTS even working? It's been showing as an invalid pipeline.
I'd love to know if anyone solved this or knows if there are ways to include speaker notes. I'm hacking the best I can with punctuation and it works well, but occasionally I'd love to insert a [Pause] or other commands if they are available.
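In case it helps, the punctuation workaround can be wrapped in a small preprocessing step so the input text can still carry notes like [Pause]. This is only a rough sketch: the marker names, the `PAUSE_MARKERS` mapping, and the `expand_speaker_notes` helper are all made up for illustration and are not commands the model understands. It just rewrites the notes into punctuation before the text is passed to the TTS, and how much pause each punctuation mark actually produces is something you would have to check by ear.

```python
import re

# Hypothetical mapping from bracketed notes to punctuation. These are not
# model commands; they are stand-ins that may nudge the prosody.
PAUSE_MARKERS = {
    "pause": ", ",
    "long pause": "... ",
}

# Matches a bracketed note together with the whitespace around it.
_MARKER_RE = re.compile(r"\s*\[([^\]]+)\]\s*")


def expand_speaker_notes(text: str) -> str:
    """Replace bracketed notes such as [Pause] with punctuation."""

    def _replace(match: re.Match) -> str:
        key = match.group(1).strip().lower()
        # Unknown notes are dropped so they are never read aloud.
        return PAUSE_MARKERS.get(key, " ")

    expanded = _MARKER_RE.sub(_replace, text)
    # Collapse any doubled spaces left behind by the substitutions.
    return re.sub(r"\s{2,}", " ", expanded).strip()


if __name__ == "__main__":
    print(expand_speaker_notes("Hello [Pause] world"))
```

The cleaned string is what you would then feed to the TTS input, in place of the annotated text.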
What are your thoughts on using TextEncoderPrenet and SpeechEncoderPrenet as inputs to enhance the speech synthesis process? These prenets would allow for the input text to be converted to speech and for reference audio to be used to determine style and tone.
Unfortunately, I am not yet proficient in this area as I am still learning.