metadata
title: Vakya 2.0 - Text-to-Speech
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.2.0
app_file: app.py
pinned: false
license: mit

🎙️ Vakya 2.0 - Text-to-Speech Playground

Vakya is a high-quality Text-to-Speech model based on the IndicF5 architecture, supporting 11 Indian languages.

🌟 Features

  • Multi-language Support: Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Odia, Punjabi, Tamil, Telugu
  • Voice Cloning: Uses reference audio to clone voice characteristics
  • High Quality: 24 kHz sample rate, 0.4B-parameter model
  • Easy to Use: Simple interface for testing and experimentation

🚀 How to Use

  1. Load Model: Click the "Load Model" button (the first load may take a few minutes while the model downloads)
  2. Upload Reference Audio: Upload a short audio clip (<15 seconds recommended) of the voice you want to clone
  3. Enter Reference Text (Optional): Type what is spoken in the reference audio. If left blank, the reference audio is transcribed automatically
  4. Enter Text to Generate: Type the text you want to synthesize in any supported language
  5. Adjust Settings (Optional):
    • Speed: Control the speech rate (0.5x to 2.0x)
    • Remove Silences: Experimental feature to remove pauses
  6. Generate: Click "Generate Speech" and wait for the audio output
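The inputs collected in the steps above can be checked before they are sent for generation. The following is a minimal sketch, not the code in app.py: `SynthesisRequest` and `build_request` are hypothetical names introduced for illustration, and only the 0.5x to 2.0x speed range and the optional reference text come from the steps above.

```python
from dataclasses import dataclass
from typing import Optional

# Speed range taken from the "Adjust Settings" step above.
MIN_SPEED, MAX_SPEED = 0.5, 2.0


@dataclass
class SynthesisRequest:
    """Hypothetical container for the playground inputs (not part of the app)."""
    ref_audio_path: str
    gen_text: str
    ref_text: Optional[str]  # None means "transcribe the reference audio automatically"
    speed: float
    remove_silences: bool


def build_request(ref_audio_path: str, gen_text: str, ref_text: str = "",
                  speed: float = 1.0, remove_silences: bool = False) -> SynthesisRequest:
    """Validate the inputs and return them as a single request object."""
    if not ref_audio_path:
        raise ValueError("a reference audio clip is required")
    if not gen_text.strip():
        raise ValueError("text to generate must not be empty")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
    # A blank reference text is treated as "not provided".
    cleaned_ref_text = ref_text.strip() or None
    return SynthesisRequest(ref_audio_path, gen_text.strip(), cleaned_ref_text,
                            speed, remove_silences)


# Example: reference text left blank, slightly faster speech.
request = build_request("reference.wav", "नमस्ते", speed=1.25)
```

Rejecting an out-of-range speed early gives a clear error message instead of a failed generation.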

📋 Model Information

  • Model: Vakya 2.0
  • Repository: ashishkblink/vakya2.0
  • Based on: IndicF5 by AI4Bharat (IIT Madras)
  • Model Size: 0.4B parameters
  • Sample Rate: 24000 Hz
  • Training Data: 1417 hours of high-quality speech
  • License: MIT License

💡 Tips for Best Results

  • Keep reference audio clips short (under 15 seconds)
  • Use clear, high-quality reference audio
  • Provide reference text when possible for better voice matching
  • The model works best when the reference audio comes from a native speaker of the target language
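The length recommendation in the tips above can be checked before uploading a clip. This is a minimal sketch that assumes the reference clip is an uncompressed PCM WAV file, which is the only format the standard-library `wave` module reads; `wav_duration_seconds` and `is_short_enough` are hypothetical helper names.

```python
import wave

MAX_REF_SECONDS = 15.0  # recommended upper bound from the tips above


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_short_enough(path, limit=MAX_REF_SECONDS):
    """Return True if the clip is shorter than the recommended limit."""
    return wav_duration_seconds(path) < limit
```

Other formats such as MP3 would need a different reader; the duration logic (frame count divided by sample rate) stays the same.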

⚠️ Terms of Use

  • You must have explicit permission to clone voices
  • Unauthorized voice cloning is strictly prohibited
  • Any misuse of this model is the responsibility of the user
  • This model is for research and educational purposes

🔗 Links

🙏 Acknowledgments

This model is based on IndicF5 developed by AI4Bharat (IIT Madras).


Vakya - Bringing voices to Indian languages 🎙️