syedusama5556/WhisperSpeechTTSRVCPipline


	---
	license: gpl-3.0
	---
	# WhisperSpeechRVCPipline


	Zero-Shot AI Voice Cloning TTS With WhisperSpeech And RVC Pipeline


	<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

	[![Test it out yourself in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1xxGlTbwBmaY6GKA24strRixTXGBOlyiw)

	*If you have questions or want to help, you can find us in the
	\#audio-generation channel on the LAION Discord server.*

	An Open Source text-to-speech system built by inverting Whisper.
	Previously known as spear-tts-pytorch.

	We want this model to be like Stable Diffusion but for speech – both
	powerful and easily customizable.

	We work only with properly licensed speech recordings, and all the
	code is Open Source, so the model will always be safe to use for
	commercial applications.

	Currently the models are trained on the English LibriLight dataset. In
	the next release we want to target multiple languages (Whisper and
	EnCodec are both multilingual).

	Sample of the synthesized voice:

	https://github.com/collabora/WhisperSpeech/assets/107984/aa5a1e7e-dc94-481f-8863-b022c7fd7434