Spaces:
Sleeping
Sleeping
| license: apache-2.0 | |
| title: 'AI VoiceCraft: Text-to-Speech Studio' | |
| sdk: gradio | |
| emoji: π | |
| colorFrom: blue | |
| colorTo: green | |
| pinned: true | |
| short_description: A powerful web app to generate dynamic text content and conv | |
| # ποΈ AI VoiceCraft: Text-to-Speech Studio π | |
| ## Overview | |
| AI VoiceCraft is a powerful web application built with Gradio that leverages cutting-edge AI to generate dynamic text content and transform it into natural-sounding speech. This tool integrates the Gemini AI model for content generation and Microsoft Edge TTS for high-quality audio synthesis. | |
| ## Features | |
| - **Dynamic Content Generation:** | |
| - Generate various content types, including stories, news, podcasts, and more. | |
| - Customize content length, theme, and style. | |
| - Utilize Gemini AI for creative and contextually relevant text output. | |
| - **High-Quality Text-to-Speech:** | |
| - Leverage Microsoft Edge TTS for realistic voice synthesis. | |
| - Support for multiple languages and voices. | |
| - Fine-tune speech rate and pitch for optimal delivery. | |
| - **User-Friendly Interface:** | |
| - Intuitive Gradio interface for easy navigation and control. | |
| - Real-time feedback and error handling. | |
| - Attractive theme applied for better user experience. | |
| - **Customization Options:** | |
| - Adjust the creativity level of the AI content generation. | |
| - Input custom prompts for fine-tuning the AI outputs. | |
| - Adjust speech rate and pitch to fit your needs. | |
| ## Getting Started | |
| ### Prerequisites | |
| - Python 3.7+ | |
| - Internet connection (for API access and TTS) | |
| - API Key for Gemini Model. | |
| ### Installation | |
| 1. Clone the repository: | |
| ```bash | |
| git clone https://github.com/musabbirkm/ContentVoiceGen.git | |
| cd ContentVoiceGen | |
| ``` | |
| 2. Install the required Python packages: | |
| ```bash | |
| pip install gradio requests edge-tts google-generativeai nest_asyncio | |
| ``` | |
| 3. Set your API key in the VOCALIS.py file. | |
| 4. Run the application: | |
| ```bash | |
| python app.py | |
| ``` | |
| 5. Open your web browser and navigate to the local URL provided by Gradio (usually `http://127.0.0.1:7860`). | |
| ## Usage | |
| 1. Select the desired content type from the dropdown menu. | |
| 2. Choose the language and voice for the TTS output. | |
| 3. Adjust the output style, content length, and theme as needed. | |
| 4. Enter any custom text or instructions in the customization field. | |
| 5. Adjust the speech rate and pitch using the sliders. | |
| 6. Click the "Submit" button to generate the text and audio. | |
| 7. Review the generated text and listen to the audio output. | |
| ## Code Structure | |
| - `your_script_name.py`: Main application script that integrates Gradio, content generation, and TTS. | |
| - `VOCALIS.py`: Contains the `Agent` and `ContentGenerator` classes for AI content generation. | |
| - `edgeTTsLang.py`: Dictionary containing the language and voice codes for Microsoft Edge TTS. | |
| ## Dependencies | |
| - `gradio`: For building the web interface. | |
| - `requests`: For making HTTP requests to the API. | |
| - `edge-tts`: For text-to-speech conversion. | |
| - `google-generativeai`: For interacting with the Gemini AI model. | |
| - `asyncio`: For asynchronous operations. | |
| - `nest_asyncio`: For handling nested asyncio events in Jupyter notebooks. | |
| ## Contributing | |
| Contributions are welcome! Please feel free to submit pull requests or open issues for bug fixes, feature requests, or improvements. | |
| ## License | |
| This project is licensed under the Apache Version 2.0 | |
| Apache 2.0 | |
| To enhance the user experience, an attractive theme has been applied to the Gradio interface. You can customize the theme further by modifying the Gradio theme settings in the `create_demo` function. |