metadata

title: Vakya 2.0 - Text-to-Speech
emoji: 🎙️
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.2.0
app_file: app.py
pinned: false
license: mit

🎙️ Vakya 2.0 - Text-to-Speech Playground

Vakya is a high-quality Text-to-Speech model based on the IndicF5 architecture, supporting 11 Indian languages.

🌟 Features

Multi-language Support: Assamese, Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Odia, Punjabi, Tamil, Telugu
Voice Cloning: Uses reference audio to clone voice characteristics
High Quality: 24 kHz sample rate, 0.4B-parameter model
Easy to Use: Simple interface for testing and experimentation

🚀 How to Use

Load Model: Click the "Load Model" button. The first load may take a few minutes while the model downloads
Upload Reference Audio: Upload a short audio clip (under 15 seconds recommended) of the voice you want to clone
Enter Reference Text (Optional): Type what is spoken in the reference audio. If left blank, the reference audio is transcribed automatically
Enter Text to Generate: Type the text you want to synthesize in any supported language
Adjust Settings (Optional):
- Speed: Control the speech rate (0.5x to 2.0x)
- Remove Silences: Experimental feature to remove pauses
Generate: Click "Generate Speech" and wait for the audio output
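
The inputs collected in the steps above can be pictured as a small request object. The following is a minimal sketch, not the code in app.py: `TTSRequest` and `normalize_request` are hypothetical names, and the default speed of 1.0 is an assumption. It only illustrates which inputs are required, which are optional, and how the speed range of 0.5x to 2.0x might be enforced.

```python
from dataclasses import dataclass
from typing import Optional

MIN_SPEED, MAX_SPEED = 0.5, 2.0  # speed range listed under "Adjust Settings"


@dataclass
class TTSRequest:
    """Hypothetical container for the playground inputs (not the app's real API)."""
    ref_audio_path: str
    gen_text: str
    ref_text: Optional[str] = None  # None means "transcribe the reference audio"
    speed: float = 1.0              # assumed default, not stated in this README
    remove_silences: bool = False   # experimental setting


def normalize_request(req: TTSRequest) -> TTSRequest:
    """Return a cleaned copy of the request, raising if a required input is missing."""
    if not req.ref_audio_path.strip():
        raise ValueError("reference audio is required")
    gen_text = req.gen_text.strip()
    if not gen_text:
        raise ValueError("text to generate is required")
    # Treat a blank or whitespace-only reference text as "not provided".
    ref_text = (req.ref_text or "").strip() or None
    # Clamp the speed into the supported range instead of rejecting it.
    speed = min(MAX_SPEED, max(MIN_SPEED, float(req.speed)))
    return TTSRequest(req.ref_audio_path, gen_text, ref_text, speed,
                      bool(req.remove_silences))
```

Clamping the speed rather than raising an error is a design choice made for this sketch; the actual interface may simply restrict the slider to the supported range.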

📋 Model Information

Model: Vakya 2.0
Repository: ashishkblink/vakya2.0
Based on: IndicF5 by AI4Bharat (IIT Madras)
Model Size: 0.4B parameters
Sample Rate: 24000 Hz
Training Data: 1417 hours of high-quality speech
License: MIT License
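
To put the sample rate and model size in perspective, here is a rough back-of-the-envelope calculation. The helper names are illustrative, and the bytes-per-parameter values are assumptions: this README does not state which numeric precision the weights are stored in, and the estimate covers the weights only, not activations or any other components.

```python
SAMPLE_RATE_HZ = 24_000    # sample rate from the model information above
PARAM_COUNT = 400_000_000  # 0.4B parameters


def samples_for(seconds: float) -> int:
    """Number of audio samples in a clip of the given length at 24 kHz."""
    return round(seconds * SAMPLE_RATE_HZ)


def weight_bytes(bytes_per_param: int) -> int:
    """Approximate size of the weights alone for an assumed precision."""
    return PARAM_COUNT * bytes_per_param


print(samples_for(15))        # 360000 samples in a 15-second clip
print(weight_bytes(4) / 1e9)  # 1.6 (GB if stored as 32-bit floats)
print(weight_bytes(2) / 1e9)  # 0.8 (GB if stored as 16-bit floats)
```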

💡 Tips for Best Results

Keep reference audio clips short (under 15 seconds)
Use clear, high-quality reference audio
Provide reference text when possible for better voice matching
Results are best when the reference audio comes from a native speaker of the target language
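
If a reference recording runs longer than the recommended 15 seconds, it can be trimmed before upload. The sketch below uses only Python's standard `wave` module, so it assumes an uncompressed PCM WAV file; `wav_duration_seconds` and `trim_wav` are hypothetical helper names, and keeping the start of the clip is an arbitrary choice for illustration.

```python
import wave

MAX_REF_SECONDS = 15.0  # recommended upper bound from the tips above


def wav_duration_seconds(path: str) -> float:
    """Length of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def trim_wav(src: str, dst: str, max_seconds: float = MAX_REF_SECONDS) -> float:
    """Copy src to dst, keeping at most max_seconds from the start.

    Returns the duration of the written file in seconds.
    """
    with wave.open(src, "rb") as wf:
        params = wf.getparams()
        keep = min(wf.getnframes(), int(max_seconds * wf.getframerate()))
        frames = wf.readframes(keep)
    with wave.open(dst, "wb") as out:
        out.setparams(params)    # same channels, sample width, and frame rate
        out.writeframes(frames)  # header frame count is corrected on write
    return keep / params.framerate
```

For example, `trim_wav("reference.wav", "reference_15s.wav")` would write a copy that is at most 15 seconds long, where both file names are placeholders.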

⚠️ Terms of Use

You must have explicit permission to clone voices
Unauthorized voice cloning is strictly prohibited
Any misuse of this model is the responsibility of the user
This model is for research and educational purposes

🔗 Links

Model Repository: ashishkblink/vakya2.0
GitHub: ashishkblink/vakya
IndicF5: AI4Bharat/IndicF5

🙏 Acknowledgments

This model is based on IndicF5 developed by AI4Bharat (IIT Madras).

Vakya - Bringing voices to Indian languages 🎙️