Reviews: February models
#36, opened by Pendrokar
Post your reviews of the February TTS models here. The models are:
- ElevenLabs
- XTTSv2
- OpenVoice
- MetaVoice
- WhisperSpeech
- Pheme
I expected to come out bashing WhisperSpeech and Pheme, but they all have their own pros and cons.
Short reviews:
- ElevenLabs - Super clear, studio quality, even after being downsampled to 24 kHz by TTS-Arena. Loses to other models when its delivery is more monotone than the competitor's.
- XTTSv2 - Clear, but not the best voice. Great narration. Sometimes cuts off part of a word at the end of a sentence. Overall, its quality is close to ElevenLabs.
- OpenVoice - Clear, but often monotone.
- MetaVoice - Muffled voice. Can hallucinate at the end of the sample.
- WhisperSpeech - Low stability. Can have cutoffs and hallucinate at the end of the sample.
- Pheme - Very bad voice quality. Very unstable, with cutoffs. However, I finally understand those Harvard sentences and why they have so few commas. There are times when Pheme correctly pauses mid-sentence, making the sentence more comprehensible. ElevenLabs never does; it plows right through.
Personally, I automatically vote for the competitor whenever WhisperSpeech or MetaVoice hallucinates sound where there should be silence.
Pendrokar changed discussion title from "February model personal reviews" to "Reviews: February models"
Severe case of MetaVoice hallucinating:
Frankly, some of the low-performing models should be disabled until they are updated or fixed.
Pendrokar changed discussion status to closed