Spaces: viviztech/kani-tts-web-app

Build error

viviztech committed 6 days ago

Commit

af41bb2

1 Parent(s): fa83b31

Upload 3 files

Files changed (3)

README.md +66 -6
app.py +57 -0
requirements.txt +6 -0

README.md CHANGED

@@ -1,12 +1,72 @@
 ---
-title: Kani Tts Web App
-emoji: 🦀
-colorFrom: purple
-colorTo: green
+title: KaniTTS Web App
+emoji: 🎤
+colorFrom: blue
+colorTo: purple
 sdk: gradio
-sdk_version: 6.0.2
+sdk_version: 4.44.0
 app_file: app.py
 pinned: false
+license: mit
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# KaniTTS Web App
+A text-to-speech web application that uses the KaniTTS model with a Gradio interface.
+## Prerequisites
+- Python 3.10 or higher
+- PyTorch 2.8.0 or higher (requires Linux or Apple Silicon Mac)
+> **Note**: This app requires PyTorch 2.8+. macOS x86_64 (Intel) only supports up to PyTorch 2.2.2. For Intel Macs, please use a cloud platform.
+## Cloud Deployment
+### Google Colab
+1. Upload the files to Google Drive or clone from GitHub
+2. Run: `!pip install -r requirements.txt`
+3. Run: `!python app.py`
+### Hugging Face Spaces
+1. Create a new Space with Gradio SDK
+2. Upload `app.py` and `requirements.txt`
+3. The app will automatically deploy
+### Replit
+1. Create a new Python Repl
+2. Upload the files
+3. Install dependencies: `pip install -r requirements.txt`
+4. Run: `python app.py`
+## Local Installation (Linux/Apple Silicon)
+```bash
+# Clone the repository
+git clone <your-repo-url>
+cd tts-app
+# Create virtual environment
+python3 -m venv venv
+source venv/bin/activate
+# Install dependencies
+pip install -r requirements.txt
+# Run the application
+python app.py
+```
+## Usage
+1. Open your browser and go to: `http://127.0.0.1:7860`
+2. Enter the text you want to convert to speech
+3. Click the "Generate" button
+4. Listen to the generated audio
+## Model Information
+- **Model**: `nineninesix/kani-tts-400m-en`
+- **Sample Rate**: 22050Hz (model default)
+- **Language**: English
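
The prerequisites listed in the README (Python 3.10 or higher, PyTorch 2.8 or higher, no Intel Macs) could be checked before the model is loaded. The following is a minimal sketch and not part of this commit: `parse_version`, `is_intel_mac`, and `check_environment` are hypothetical helper names, and in practice the version string would presumably come from `torch.__version__`.

```python
import platform
import sys

MIN_PYTHON = (3, 10)  # taken from the README prerequisites
MIN_TORCH = (2, 8)    # taken from the README prerequisites


def parse_version(text):
    """Turn '2.8.0+cu128' into (2, 8, 0), ignoring local and pre-release suffixes."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_intel_mac(system=None, machine=None):
    """True on macOS x86_64, where the README says PyTorch 2.8+ is unavailable."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return system == "Darwin" and machine == "x86_64"


def check_environment(torch_version=None):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.10 or higher is required")
    if is_intel_mac():
        problems.append("Intel Macs cannot install PyTorch 2.8+")
    if torch_version is not None and parse_version(torch_version)[:2] < MIN_TORCH:
        problems.append("PyTorch 2.8.0 or higher is required")
    return problems
```

Calling `check_environment(torch.__version__)` at the top of `app.py` and printing any returned messages would surface an unsupported setup before the model download starts.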

app.py ADDED

@@ -0,0 +1,57 @@

+import gradio as gr
+from kani_tts import KaniTTS
+import soundfile as sf
+import tempfile
+# Initialize the KaniTTS model
+print("Loading KaniTTS model...")
+model = KaniTTS('nineninesix/kani-tts-400m-en')
+print("Model loaded successfully!")
+def generate_speech(text):
+    """Generate speech from text using KaniTTS model."""
+    if not text or not text.strip():
+        return None
+    try:
+        # Generate audio from text
+        audio, _ = model(text)
+        # Get sample rate from model (typically 22050Hz or 24000Hz)
+        sample_rate = model.sample_rate
+        # Reserve a temporary .wav path and close the handle before writing to it
+        with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as temp_file:
+            wav_path = temp_file.name
+        sf.write(wav_path, audio, sample_rate)
+        return wav_path
+    except Exception as e:
+        print(f"Error generating speech: {e}")
+        return None
+# Create Gradio interface
+with gr.Blocks(title="KaniTTS Web App") as demo:
+    gr.Markdown("# KaniTTS Web App")
+    gr.Markdown("Enter text below and click Generate to convert it to speech.")
+    with gr.Row():
+        text_input = gr.Textbox(
+            label="Text Input",
+            placeholder="Enter text to convert to speech...",
+            lines=3
+        )
+    generate_btn = gr.Button("Generate", variant="primary")
+    audio_output = gr.Audio(label="Generated Audio", type="filepath")
+    generate_btn.click(
+        fn=generate_speech,
+        inputs=text_input,
+        outputs=audio_output
+    )
+if __name__ == "__main__":
+    demo.launch(server_port=7860)
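
A temporary file created with `delete=False` stays open until the caller closes it, and on some platforms a second writer cannot open the same path in the meantime. A minimal sketch of one way to handle this: reserve the path with `tempfile.mkstemp`, close the descriptor, then write the audio. To stay runnable without the app's dependencies, the sketch swaps `soundfile` for the standard-library `wave` module, and it assumes the audio is a one-dimensional sequence of floats in [-1.0, 1.0], which this commit does not confirm. `write_temp_wav` is a hypothetical name.

```python
import os
import struct
import tempfile
import wave


def write_temp_wav(samples, sample_rate):
    """Write mono float samples in [-1.0, 1.0] to a new 16-bit PCM .wav file."""
    # mkstemp returns an open OS-level descriptor; close it so the path can be
    # reopened for writing on every platform.
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)

    # Clip each sample to [-1.0, 1.0] and scale it to a signed 16-bit integer.
    pcm = [int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples]

    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)          # mono
        wav_file.setsampwidth(2)          # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    return path
```

The returned path can be handed to a `gr.Audio(type="filepath")` output as in the app. The file is not deleted automatically, so a long-running deployment would still need some cleanup policy.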

requirements.txt ADDED

@@ -0,0 +1,6 @@

+kani-tts>=0.0.4
+gradio>=4.0.0
+torch>=2.8.0
+soundfile>=0.12.0
+numpy>=1.24.0
+transformers>=4.57.0
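
The minimum versions in `requirements.txt` can be compared against what is actually installed. This is a rough sketch, assuming every line uses the plain `name>=X.Y.Z` form seen above; `parse_requirement`, `installed_version`, and `report` are hypothetical helpers, and only `importlib.metadata` from the standard library is used.

```python
from importlib import metadata

# Copied from the requirements.txt added in this commit.
REQUIREMENTS = """\
kani-tts>=0.0.4
gradio>=4.0.0
torch>=2.8.0
soundfile>=0.12.0
numpy>=1.24.0
transformers>=4.57.0
"""


def parse_requirement(line):
    """Split 'name>=1.2.3' into ('name', (1, 2, 3)); only '>=' is handled."""
    name, _, minimum = line.strip().partition(">=")
    return name.strip(), tuple(int(p) for p in minimum.strip().split("."))


def installed_version(name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def report(requirements_text=REQUIREMENTS):
    """Return (name, minimum_version_tuple, installed_version_or_None) rows."""
    rows = []
    for line in requirements_text.splitlines():
        if line.strip():
            name, minimum = parse_requirement(line)
            rows.append((name, minimum, installed_version(name)))
    return rows
```

Printing the rows from `report()` after `pip install -r requirements.txt` gives a quick view of which packages resolved and at what version, which can help when a Space build fails.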