---
license: other
license_name: coqui-public-model-license
license_link: https://coqui.ai/cpml
library_name: coqui
pipeline_tag: text-to-speech
widget:
- text: "Once when I was six years old I saw a magnificent picture"
---

# XTTS v2 Fine-Tuned on Hindi Datasets

**Model Name**: XTTS v2 Fine-Tuned on Hindi Datasets

**Model Description**: This is a version of the XTTS v2 (Cross-lingual Text-to-Speech) model developed by Coqui-AI, fine-tuned on Hindi speech datasets to generate more natural and accurate Hindi speech. The model supports voice cloning and multilingual speech generation.

### Colab Notebook

The Colab notebook used to fine-tune XTTS v2 on Hindi datasets is available at this [Colab Notebook Link](https://colab.research.google.com/drive/1VwNltFIcqhB7Ydt4NVaPnYegl-qHoUSO#scrollTo=KKj-kq7iCG3d). Follow it to replicate the process.

### Features

- **Languages**: Supports 17 languages, including Hindi (hi).
- **Voice Cloning**: Clone voices with just a 6-second audio clip.
- **Emotion and Style Transfer**: Achieve emotion and style transfer by cloning.
- **Cross-Language Voice Cloning**: Supports voice cloning across different languages.
- **Sampling Rate**: 24kHz sampling rate for high-quality audio.

### Updates over XTTS-v1

- **New Languages**: Added support for Hungarian and Korean.
- **Architectural Improvements**: Enhanced speaker conditioning and interpolation.
- **Stability Improvements**: Better overall stability and performance.
- **Audio Quality**: Improved prosody and audio quality.

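
The 6-second reference clip mentioned under Features can be checked before attempting voice cloning. The sketch below is an illustration only: it assumes the reference audio is an uncompressed PCM WAV file, uses Python's standard `wave` module, and the helper names `describe_reference_clip` and `is_long_enough` are hypothetical rather than part of 🐸TTS.

```python
import wave

# Clip length quoted in the Features list; treated here as a minimum (assumption).
MIN_REFERENCE_SECONDS = 6.0


def describe_reference_clip(path):
    """Return (duration_seconds, sample_rate_hz) for a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        duration = wav_file.getnframes() / sample_rate
    return duration, sample_rate


def is_long_enough(path, min_seconds=MIN_REFERENCE_SECONDS):
    """Check whether the clip at `path` lasts at least `min_seconds`."""
    duration, _ = describe_reference_clip(path)
    return duration >= min_seconds
```

A clip that fails this check may still work, but a longer recording is a reasonable first thing to try if the cloned voice sounds off.
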
### Languages

The XTTS-v2 model supports 17 languages:

- **English (en)**
- **Spanish (es)**
- **French (fr)**
- **German (de)**
- **Italian (it)**
- **Portuguese (pt)**
- **Polish (pl)**
- **Turkish (tr)**
- **Russian (ru)**
- **Dutch (nl)**
- **Czech (cs)**
- **Arabic (ar)**
- **Chinese (zh-cn)**
- **Japanese (ja)**
- **Hungarian (hu)**
- **Korean (ko)**
- **Hindi (hi)**

### Training Data

The model was fine-tuned on the following Hindi datasets:

- **Mozilla CommonVoice 18**: A diverse dataset of Hindi speech.
- **IndicTTS Hindi Dataset**: Hindi speech data for text-to-speech training.

### Code

The [code-base](https://github.com/coqui-ai/TTS) supports both inference and [fine-tuning](https://tts.readthedocs.io/en/latest/models/xtts.html#training).

### Demo Spaces

- [XTTS Space](https://huggingface.co/spaces/coqui/xtts): Explore the model's performance on supported languages and try it with your own reference or microphone input.
- [XTTS Voice Chat with Mistral or Zephyr](https://huggingface.co/spaces/coqui/voice-chat-with-mistral): Experience streaming voice chat with Mistral 7B Instruct or Zephyr 7B Beta.

### License

This model is licensed under the [Coqui Public Model License](https://coqui.ai/cpml). Read more about the [origin story of CPML here](https://coqui.ai/blog/tts/cpml).

### Contact

Join our 🐸 Community on [Discord](https://discord.gg/fBC58unbKE) and follow us on [Twitter](https://twitter.com/coqui_ai). For inquiries, you can also email us at info@coqui.ai.

### Usage

#### Using 🐸TTS API

```python
from TTS.api import TTS

# Load XTTS v2 by its 🐸TTS model name (gpu=True requires a CUDA-capable GPU)
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2", gpu=True)

# Generate speech by cloning a voice using default settings.
# `text` should be written in the language named by `language`;
# replace this English placeholder with Hindi text when language="hi".
tts.tts_to_file(
    text="It took me quite a long time to develop a voice, and now that I have it I'm not going to be silent.",
    file_path="output.wav",
    speaker_wav="/path/to/target/speaker.wav",
    language="hi"
)
```
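
Because `tts_to_file` takes a language code, it can help to validate the code against the list under Languages before starting synthesis. The following is a minimal sketch, not part of 🐸TTS: the helper `build_synthesis_kwargs` and the constant `SUPPORTED_LANGUAGES` are hypothetical names, and the codes are copied from the Languages section of this card.

```python
# Language codes copied from the Languages section of this model card.
SUPPORTED_LANGUAGES = (
    "en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru",
    "nl", "cs", "ar", "zh-cn", "ja", "hu", "ko", "hi",
)


def build_synthesis_kwargs(text, speaker_wav, language="hi", file_path="output.wav"):
    """Validate the inputs and return keyword arguments for tts.tts_to_file().

    Hypothetical helper for illustration; it only checks the arguments
    and does not load or run any model.
    """
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {language!r}")
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "text": text,
        "file_path": file_path,
        "speaker_wav": speaker_wav,
        "language": language,
    }


# Example: Hindi text paired with the Hindi language code.
kwargs = build_synthesis_kwargs(
    text="नमस्ते, आप कैसे हैं?",
    speaker_wav="/path/to/target/speaker.wav",
)
# With a loaded model, the call would then be: tts.tts_to_file(**kwargs)
```

Keeping validation separate from synthesis means a typo in the language code fails immediately instead of after the model has been loaded.
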