Parler-TTS
High-fidelity Text-To-Speech
If you want to find out more about how these models were trained, or even fine-tune them yourself, check out the Parler-TTS repository on GitHub.
Note Parler-TTS Large is a 2.2B-parameter Parler checkpoint, trained on 45K hours of audio data.
Note Parler-TTS Mini is an 880M-parameter Parler checkpoint, trained on 45K hours of audio data.
Note Parler-TTS v0.1 is a lightweight text-to-speech (TTS) model, trained on 10.5K hours of audio data, that generates high-quality, natural-sounding speech whose characteristics (e.g. gender, background noise, speaking rate, pitch and reverberation) can be controlled with a simple text description; see the usage sketch after these notes. It is the first model released by the Parler-TTS project, which aims to provide the community with TTS training resources and dataset pre-processing code. V1 coming soon!
Note Used to recover the audio waveform from the audio tokens predicted by the decoder. We use the DAC model from Descript, although other codec models, such as EnCodec, can also be used.
Note Used to encode text descriptions.
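As a rough illustration of the prompt-based control described above, the sketch below generates speech with a Parler-TTS checkpoint via the `parler_tts` package from the GitHub repository. The checkpoint id, description, spoken prompt, and output filename are illustrative assumptions, not part of this collection.

```python
# Minimal usage sketch. Assumes `pip install git+https://github.com/huggingface/parler-tts.git`
# plus torch and soundfile; checkpoint id and prompts below are illustrative.
import torch
import soundfile as sf
from transformers import AutoTokenizer
from parler_tts import ParlerTTSForConditionalGeneration

device = "cuda:0" if torch.cuda.is_available() else "cpu"

repo_id = "parler-tts/parler-tts-mini-v1"  # assumed checkpoint id
model = ParlerTTSForConditionalGeneration.from_pretrained(repo_id).to(device)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

# The description controls voice attributes (gender, speaking rate, pitch,
# background noise, reverberation); the prompt is the text to be spoken.
description = "A female speaker delivers her words at a moderate pace in a quiet room."
prompt = "Hey, how are you doing today?"

input_ids = tokenizer(description, return_tensors="pt").input_ids.to(device)
prompt_input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

# Generate audio tokens with the decoder, then decode them to a waveform
# (the codec step that the DAC note above refers to happens inside generate).
generation = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)
audio = generation.cpu().numpy().squeeze()
sf.write("parler_tts_out.wav", audio, model.config.sampling_rate)
```

Changing only the description string (for example, asking for a faster speaking rate or a noisier background) is how the controllable features listed in the notes are exercised.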