syedusama5556
committed on
Commit 9570840 (1 parent: 57ba65d)
Update README.md
README.md
CHANGED
@@ -1,3 +1,33 @@
 ---
 license: gpl-3.0
 ---
+# WhisperSpeechRVCPipline
+
+
+**Zero-Shot AI Voice Cloning TTS With WhisperSpeech And RVC Pipeline**
+
+
+<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->
+
+[![Test it out yourself in
+Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1xxGlTbwBmaY6GKA24strRixTXGBOlyiw)
+*If you have questions or you want to help you can find us in the
+\#audio-generation channel on the LAION Discord server.*
+
+An Open Source text-to-speech system built by inverting Whisper.
+Previously known as **spear-tts-pytorch**.
+
+We want this model to be like Stable Diffusion but for speech – both
+powerful and easily customizable.
+
+We are working only with properly licensed speech recordings and all the
+code is Open Source, so the model will always be safe to use for
+commercial applications.
+
+Currently the models are trained on the English LibriLight dataset. In
+the next release we want to target multiple languages (Whisper and
+EnCodec are both multilingual).
+
+Sample of the synthesized voice:
+
+https://github.com/collabora/WhisperSpeech/assets/107984/aa5a1e7e-dc94-481f-8863-b022c7fd7434