---
license: gpl-3.0
---

# WhisperSpeechRVCPipline

**Zero-Shot AI Voice Cloning TTS With WhisperSpeech And RVC Pipeline**

<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

[![Test it out yourself in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1xxGlTbwBmaY6GKA24strRixTXGBOlyiw)

*If you have questions or want to help, you can find us in the
\#audio-generation channel on the LAION Discord server.*

An Open Source text-to-speech system built by inverting Whisper.
Previously known as **spear-tts-pytorch**.

We want this model to be like Stable Diffusion but for speech – both
powerful and easily customizable.

We work only with properly licensed speech recordings, and all the
code is Open Source, so the model will always be safe to use for
commercial applications.

Currently the models are trained on the English LibriLight dataset. In
the next release we want to target multiple languages (Whisper and
EnCodec are both multilingual).

Sample of the synthesized voice:

https://github.com/collabora/WhisperSpeech/assets/107984/aa5a1e7e-dc94-481f-8863-b022c7fd7434