Parler TTS
Parler-TTS
Parler-TTS is a lightweight text-to-speech (TTS) model that can generate high-quality, natural-sounding speech in the style of a given speaker (gender, pitch, speaking style, etc.). It is a reproduction of work from the paper "Natural language guidance of high-fidelity text-to-speech with synthetic annotations" by Dan Lyth and Simon King, from Stability AI and Edinburgh University respectively.
Unlike other TTS models, Parler-TTS is a fully open-source release. All of the datasets, pre-processing, training code, and weights are released publicly under a permissive license, enabling the community to build on our work and develop their own powerful TTS models. The release consists of:
- The Parler-TTS library for using and training high-quality TTS models.
- The Data-Speech repository, for annotating speech characteristics in a large-scale setting.
- This organization, which contains the released datasets and weights.
Two new checkpoints, Parler-TTS Mini v1.1 and Large v1, are out! Trained on 45k hours of narrated audio, they're better and faster than previous versions, and introduce speaker consistency across generations. Try them out here!