Problem with OpenVoice
Hello, and thanks for making this project. It's very useful for evaluating TTS models from a human-perception perspective.
I am the author of OpenVoice. I noticed that the OpenVoice samples in the comparisons have some artifacts, and I would love to help you get the right configuration and fix the bugs.
One problem might be that the reference audio you used is not clean enough or is shorter than 30s. Could you try some clean voices such as
https://aiartes.com/records/aerith_original.mp3
or https://aiartes.com/records/aloy_original.mp3
or some other samples in https://aiartes.com/voiceai.
Given clean reference audio as input, OpenVoice should be able to generate very high-quality audio.
Very much appreciated.
I will look into this, thank you for letting us know!
Hi, I updated the reference voices. Please let me know if it works!
Thanks @mrfakename ❤️
I listened to a couple of examples and there are still some artifacts. Could you use this one instead?
@mrfakename
Hi, the audio now uses that sample. Please let me know if it works now!
Thanks. It seems that the current version of OpenVoice tends to produce artifacts on short sentences. We will fix this issue and let you know when it's done.
Wait... hold up... @mrfakename, you, @reach-vb, and the team should really define the rules, as these are all copyrighted samples used for insta-cloning. Everyone might as well just fine-tune their voice model against Scarlett Johansson from the movie "Her", and this would be a competition for the best impersonator.
[Edit] @reach-vb did clarify "In general, we're biased towards recent + open access models which have been trained on more than just LJSpeech or VCTK.", but I didn't think that would mean training voices on copyrighted material.
@ZenQin also, none of the posted samples have a pause between sentences, meaning the game-audio samples were merged into one without padding. So I would expect artifacts.
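For anyone preparing a reference clip, here is a rough sketch of what padding between merged clips could look like. It is not part of OpenVoice: the function name `join_with_silence` and its defaults are hypothetical, and it assumes the clips are already decoded to raw signed 16-bit mono PCM at the same sample rate.

```python
def join_with_silence(clips, sample_rate=16000, pause_s=0.3, sample_width=2):
    """Concatenate raw mono PCM clips with a short silence between each pair.

    Assumption: signed PCM (e.g. 16-bit), where silence is all-zero bytes.
    This is a hypothetical helper, not an OpenVoice API.
    """
    # Number of silent samples, times bytes per sample.
    pause = b"\x00" * (int(sample_rate * pause_s) * sample_width)
    # bytes.join puts the pause only between clips, not at the ends.
    return pause.join(clips)
```

The result can be written out with the standard `wave` module using the same sample rate and sample width.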
@mrfakename Could you help change to this reference voice instead? It should make the model work more stably for now. Very much appreciated. We will fix the issue I mentioned in my previous reply in OpenVoice V2 and improve the overall quality.
Zen, now you are definitely trolling.
Thanks! I'll close the issue for now.