metadata
title: VoiceVerse AI
emoji: 🎙️
colorFrom: indigo
colorTo: purple
sdk: gradio
sdk_version: 5.23.1
python_version: '3.10'
app_file: app.py
pinned: false
🎙️ VoiceVerse AI: Document to Audio
Transform uploaded documents into engaging, emotionally expressive podcast-style audio narrations.
Pipeline
PDF/TXT → Text Extraction → RAG (chunk + embed + retrieve) → Script Generation (Mistral-7B) → TTS (Qwen3-TTS / Edge-TTS) → Audio Playback
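The chunk-and-retrieve step of the pipeline can be sketched as follows. This is a minimal illustration, not the project's `rag.py`: the function names, the chunk sizes, and the word-count vectors standing in for all-MiniLM-L6-v2 embeddings are all assumptions, chosen so the example runs without downloading a model.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=40, overlap=10):
    """Split text into windows of `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in [0, size)")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def embed(text):
    """Stand-in embedding (assumption): a word-count vector, not a neural model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

In a real run, `embed` would be replaced by a call to the embedding model, and the retrieved chunks would be passed to the script-generation prompt.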
Models Used
| Component | Model | How |
|---|---|---|
| Embeddings | all-MiniLM-L6-v2 | Local (CPU) |
| Script Gen | Mistral-7B-Instruct-v0.3 | HF Inference API |
| TTS (primary) | Qwen3-TTS | HF Inference API |
| TTS (fallback) | Edge-TTS (AriaNeural) | Local (CPU) |
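The primary/fallback split for TTS suggests a try-then-fall-back pattern. The sketch below shows one way that could look; `synthesize_with_fallback` and its two callable arguments are hypothetical names, not functions from `tts.py`, and the real engines are replaced by plain callables that take text and return audio bytes.

```python
import logging

logger = logging.getLogger("voiceverse.tts")


def synthesize_with_fallback(text, primary, fallback):
    """Return (audio_bytes, engine_label), trying `primary` before `fallback`.

    `primary` and `fallback` are hypothetical callables: text in, bytes out.
    """
    if not text.strip():
        raise ValueError("nothing to synthesize: text is empty")
    try:
        return primary(text), "primary"
    except Exception as exc:  # remote inference can fail in many ways
        logger.warning("primary TTS failed (%s); using fallback", exc)
        return fallback(text), "fallback"
```

Returning the engine label alongside the audio lets the UI tell the user which voice was actually used.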
Setup
pip install -r requirements.txt
export HF_TOKEN="your_huggingface_token_here"
python app.py
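Because two of the models are reached through the HF Inference API, a missing token only surfaces when the first request fails. A startup check like the one below can report the problem earlier. It is a sketch: `require_hf_token` is a hypothetical helper, not part of the project.

```python
import os

PLACEHOLDER = "your_huggingface_token_here"


def require_hf_token(env=None):
    """Return the HF token, or raise with a clear message if it is missing."""
    env = os.environ if env is None else env
    token = (env.get("HF_TOKEN") or "").strip()
    if not token or token == PLACEHOLDER:
        raise RuntimeError(
            "HF_TOKEN is not set. Export it before running app.py."
        )
    return token
```

Passing `env` explicitly keeps the function easy to test without touching the real environment.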
Deployment on HF Spaces
- Create a new Space (Gradio SDK)
- Upload all project files
- Set `HF_TOKEN` as a Space Secret
- The app will auto-launch on port 7860
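For local runs that should mirror the Space, the host and port can be passed to Gradio's `launch()` explicitly. The helper below is a hypothetical sketch: it builds the `server_name` and `server_port` arguments, defaulting to port 7860, and assumes the app object in `app.py` is called `demo`.

```python
import os


def launch_kwargs(env=None):
    """Build host/port arguments for Gradio's launch(), defaulting to port 7860."""
    env = os.environ if env is None else env
    return {
        "server_name": env.get("GRADIO_SERVER_NAME", "0.0.0.0"),
        "server_port": int(env.get("GRADIO_SERVER_PORT", "7860")),
    }


# In app.py, assuming the Gradio app object is named `demo`:
# demo.launch(**launch_kwargs())
```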
Project Structure
app.py # Gradio UI entry point
rag.py # Document ingestion, chunking, embedding, retrieval
script_gen.py # LLM script generation (Mistral-7B-Instruct)
tts.py # Text-to-speech (Qwen3-TTS + Edge-TTS fallback)
utils.py # Helpers (temp files, validation, error formatting)
requirements.txt # Python dependencies
packages.txt # System packages (ffmpeg)
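As an illustration of the kind of validation helper `utils.py` is described as holding, the sketch below checks an upload against the two input types the pipeline accepts (PDF and TXT). The function name, the 10 MB limit, and the error wording are assumptions, not the project's actual code.

```python
from pathlib import Path

ALLOWED_SUFFIXES = {".pdf", ".txt"}
MAX_BYTES = 10 * 1024 * 1024  # assumed upload limit, not taken from the project


def validate_upload(filename, size_bytes):
    """Return None if the upload looks acceptable, else a user-facing error string."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        return f"Unsupported file type '{suffix or filename}'. Upload a PDF or TXT file."
    if size_bytes <= 0:
        return "The uploaded file is empty."
    if size_bytes > MAX_BYTES:
        return "The uploaded file is too large."
    return None
```

Returning a message instead of raising makes it simple to show the result directly in a Gradio output component.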