Parler TTS

Parler-TTS

Parler-TTS is a lightweight text-to-speech (TTS) model that can generate high-quality, natural-sounding speech in the style of a given speaker (gender, pitch, speaking style, etc.). It is a reproduction of work from the paper "Natural language guidance of high-fidelity text-to-speech with synthetic annotations" by Dan Lyth and Simon King, from Stability AI and the University of Edinburgh respectively.
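As a rough illustration of conditioning speech on a described speaker, here is a minimal sketch using the `parler_tts` Python package together with `transformers`. The helper names `build_description` and `synthesize`, the description wording, the output path, and the default checkpoint id `parler-tts/parler-tts-mini-v1` are assumptions made for this example, not something this page specifies. Check the model card of the checkpoint you pick, since tokenizer setup may differ between releases.

```python
def build_description(gender: str, pitch: str, style: str) -> str:
    """Compose a plain-English voice description (hypothetical helper)."""
    return (
        f"A {gender} speaker with a {pitch} voice "
        f"delivers the words in a {style} manner."
    )


def synthesize(
    text: str,
    description: str,
    out_path: str = "parler_tts_out.wav",             # assumed output location
    model_id: str = "parler-tts/parler-tts-mini-v1",  # assumed checkpoint id
) -> str:
    """Generate speech for `text` in the voice described by `description`."""
    # Heavy dependencies are imported lazily so the helper above can be
    # used without them installed.
    import soundfile as sf
    import torch
    from parler_tts import ParlerTTSForConditionalGeneration
    from transformers import AutoTokenizer

    device = "cuda:0" if torch.cuda.is_available() else "cpu"
    model = ParlerTTSForConditionalGeneration.from_pretrained(model_id).to(device)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # The description steers the voice; the prompt is the text to be spoken.
    input_ids = tokenizer(description, return_tensors="pt").input_ids.to(device)
    prompt_input_ids = tokenizer(text, return_tensors="pt").input_ids.to(device)

    generation = model.generate(
        input_ids=input_ids, prompt_input_ids=prompt_input_ids
    )
    audio = generation.cpu().numpy().squeeze()
    sf.write(out_path, audio, model.config.sampling_rate)
    return out_path
```

Calling `synthesize("Hey, how are you doing today?", build_description("female", "low-pitched", "calm"))` would download the checkpoint on first use and write a WAV file. Changing only the description is the intended way to vary the voice while keeping the spoken text fixed.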

Unlike other TTS models, Parler-TTS is a fully open-source release. All of the datasets, pre-processing, training code, and weights are released publicly under a permissive license, enabling the community to build on our work and develop their own powerful TTS models. It consists of:

🚨 Two new checkpoints, Parler-TTS Mini v1.1 and Large v1, are out! 🚨 Trained on 45k hours of narrated audio, they're better and faster than previous versions, and introduce speaker consistency across generations. Try them out here 🤗!