Tortoise TTS generates a voice that reads syllable by syllable and doesn't sound close to the examples
Hello. I'm trying to use your model with tortoise-tts but it sounds odd.
I added ruslan.pth to the .model directory and tried to run it with this command:
python tortoise/do_tts.py --text "Мне всегда готовы предоставить работу, которая обеспечит нормальное биологическое существование." --voice ruslan --preset fast --model_dir .model
and also with this Google Colab.
I also tried the high_quality preset, and tried putting 1000 examples in tortoise/voices/ruslan,
but neither helped at all.
How can I improve the quality of the generated voice with this model? Should I use special parameters for do_tts?
Hey, @drewdru! Here's a sample that I quickly generated from random voice samples using https://git.ecker.tech/mrq/ai-voice-cloning/ .
Are you getting results similar to the above?
The result wasn't as good as yours.
Where should I put ruslan.pth? Is this the right model path: ai-voice-cloning/models/tortoise/ruslan.pth?
After generation I got an extra model file: ai-voice-cloning/voices/ruslan/cond_latents_d1f79232.pth. Is that expected?
How many audio files do you use in the voices directory? If you used all the dataset files there, could you provide this model too?
Thank you ^o^
I put ruslan.pth at models/finetunes/ruslan.pth, then on the Settings tab selected it as the Autoregressive Model and selected "Model (Re)Load TTS".
Now it works great.
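For anyone who wants to script the file placement described above, here is a minimal sketch. It only copies the checkpoint into models/finetunes/ inside an ai-voice-cloning checkout. The function name install_finetune and both example paths are my own assumptions, not part of ai-voice-cloning, and selecting the model as the Autoregressive Model still has to be done on the Settings tab.

```python
import shutil
from pathlib import Path


def install_finetune(checkpoint: Path, repo_root: Path) -> Path:
    """Copy a fine-tuned checkpoint into <repo_root>/models/finetunes/.

    Hypothetical helper: the models/finetunes location is taken from this
    thread, everything else here is an illustrative assumption.
    """
    checkpoint = Path(checkpoint)
    if not checkpoint.is_file():
        raise FileNotFoundError(f"checkpoint not found: {checkpoint}")

    # Create models/finetunes/ if the checkout does not have it yet.
    target_dir = Path(repo_root) / "models" / "finetunes"
    target_dir.mkdir(parents=True, exist_ok=True)

    # Keep the original file name so it is easy to spot in the model list.
    target = target_dir / checkpoint.name
    shutil.copy2(checkpoint, target)
    return target


# Example call (paths are assumptions, adjust to your own layout):
# install_finetune(Path("ruslan.pth"), Path("ai-voice-cloning"))
```

After copying, reload the model in the UI as described above so the new checkpoint is picked up.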