@@ -1,30 +1,47 @@
----
-language:
-- vi
-- en
-pipeline_tag: text-to-speech
----
 <!-- # VietTTS: An Open-Source Vietnamese Text to Speech -->
 <p align="center">
-  <img src="https://github.com/dangvansam/viet-tts/blob/main/assets/viet-tts-medium.png?raw=true" style="width: 22%">
   <h1 align="center" style="color: white; font-weight: bold; font-family:roboto"><span style="color: white; font-weight: bold; font-family:roboto">VietTTS</span>: An Open-Source Vietnamese Text to Speech</h1>
 </p>
 <p align="center">
   <a href="https://github.com/dangvansam/viet-tts"><img src="https://img.shields.io/github/stars/dangvansam/viet-tts?style=social"></a>
 </p>
 **VietTTS** is an open-source toolkit providing the community with a powerful Vietnamese TTS model, capable of natural voice synthesis and robust voice cloning. Designed for effective experimentation, **VietTTS** supports research and application in Vietnamese voice technologies.
 ## ⭐ Key Features
 - **TTS**: Text-to-Speech generation with any voice via prompt audio
-- **VC**: Voice Conversion (TODO)
 ## 🛠️ Installation
-VietTTS can be installed via either a Python installer or Docker.
-### Python Installer
 ```bash
 git clone https://github.com/dangvansam/viet-tts.git
 cd viet-tts
@@ -52,11 +69,8 @@ docker compose build
 # Run with docker-compose - will create server at: http://localhost:8298
 docker compose up -d
-# Run with docker run - will create server at: http://localhost:8298
 docker run -itd --gpus all -p 8298:8298 -v ./pretrained-models:/app/pretrained-models --name viet-tts-service viet-tts:latest viettts server --host 0.0.0.0 --port 8298
-# Show available voices
-docker exec viet-tts-service viettts show-voices
 ```
 ## 🚀 Usage
@@ -106,11 +120,14 @@ viettts --help
 # Start API Server
 viettts server --host 0.0.0.0 --port 8298
-# Synthesis speech from text
-viettts synthesis --text "Xin chào" --voice 0 --output test.wav
 # List all built-in voices
 viettts show-voices
 ```
 ### API Client
@@ -142,14 +159,24 @@ with client.audio.speech.with_streaming_response.create(
 #### CURL
 ```bash
 curl http://localhost:8298/v1/audio/speech \
-  -H "Authorization: Bearer viet-tts" \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "tts-1",
-    "input": "Xin chào Việt Nam.",
-    "voice": "son-tung-mtp"
-  }' \
   --output speech.wav
 ```

+---
+language:
+- vi
+- en
+pipeline_tag: text-to-speech
+license: apache-2.0
+tags:
+- tts
+- text-to-speech
+- vietnamese
+- speech-synthesis
+- speech
+- viet-tts
+- viettts
+---
 <!-- # VietTTS: An Open-Source Vietnamese Text to Speech -->
 <p align="center">
+  <img src="assets/viet-tts-medium.png" style="width: 200px">
   <h1 align="center" style="color: white; font-weight: bold; font-family:roboto"><span style="color: white; font-weight: bold; font-family:roboto">VietTTS</span>: An Open-Source Vietnamese Text to Speech</h1>
 </p>
 <p align="center">
   <a href="https://github.com/dangvansam/viet-tts"><img src="https://img.shields.io/github/stars/dangvansam/viet-tts?style=social"></a>
+  <a href="https://huggingface.co/dangvansam/viet-tts"><img src="https://img.shields.io/badge/%F0%9F%A4%97HuggingFace-Model-yellow"></a>
+  <a href="https://huggingface.co/dangvansam/viet-tts"><img src="https://img.shields.io/badge/%F0%9F%A4%97HuggingFace-Demo-green"></a>
+    <a href="https://github.com/dangvansam/viet-tts"><img src="https://img.shields.io/badge/Python-3.10-green"></a>
+    <!-- <a href="https://pypi.org/project/viet-tts" target="_blank"><img src="https://img.shields.io/pypi/v/viet-tts.svg" alt="PyPI Version"></a> -->
+    <a href="LICENSE"><img src="https://img.shields.io/github/license/dangvansam/viet-tts"></a>
+    <br>
+    <a href="README.md"><img src="https://img.shields.io/badge/README-English-blue"></a>
+    <a href="README_VN.md"><img src="https://img.shields.io/badge/README-Tiếng%20Việt-red"></a>
 </p>
 **VietTTS** is an open-source toolkit providing the community with a powerful Vietnamese TTS model, capable of natural voice synthesis and robust voice cloning. Designed for effective experimentation, **VietTTS** supports research and application in Vietnamese voice technologies.
 ## ⭐ Key Features
 - **TTS**: Text-to-Speech generation with any voice via prompt audio
+- **OpenAI-API-compatible**: Compatible with OpenAI's Text-to-Speech API format
 ## 🛠️ Installation
+VietTTS can be installed via a Python installer (Linux only, with Windows and macOS support coming soon) or Docker.
+### Python Installer (Python>=3.10)
 ```bash
 git clone https://github.com/dangvansam/viet-tts.git
 cd viet-tts
 # Run with docker-compose - will create server at: http://localhost:8298
 docker compose up -d
+# Or run with docker run - will create server at: http://localhost:8298
 docker run -itd --gpus all -p 8298:8298 -v ./pretrained-models:/app/pretrained-models --name viet-tts-service viet-tts:latest viettts server --host 0.0.0.0 --port 8298
 ```
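Both Docker commands above start the server detached, so a client script may want to wait until port 8298 accepts connections before sending requests. The following is a minimal sketch using only the Python standard library; `wait_for_port` is a hypothetical helper, not part of VietTTS, and an open port only shows that the server process is listening, not that the model has finished loading.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example usage (port 8298 is the one published by the Docker commands above):
# if wait_for_port("localhost", 8298):
#     print("server is accepting connections")
```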
 ## 🚀 Usage
 # Start API Server
 viettts server --host 0.0.0.0 --port 8298
 # List all built-in voices
 viettts show-voices
+# Synthesize speech from text with built-in voices
+viettts synthesis --text "Xin chào" --voice 0 --output test.wav
+# Clone voice from a local audio file
+viettts synthesis --text "Xin chào" --voice Download/voice.wav --output cloned.wav
 ```
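The `viettts synthesis` command can also be driven from a Python script. The sketch below only wraps the CLI flags shown above (`--text`, `--voice`, `--output`) with `subprocess`; the helper names `build_synthesis_command` and `synthesize` are hypothetical and not part of the VietTTS package, and the code assumes `viettts` is on your `PATH`.

```python
import subprocess
from typing import List


def build_synthesis_command(text: str, voice: str, output: str) -> List[str]:
    """Build the argv list for `viettts synthesis` from the documented flags."""
    # `voice` may be a built-in voice id (e.g. "0") or a path to a local audio file.
    return ["viettts", "synthesis", "--text", text, "--voice", str(voice), "--output", output]


def synthesize(text: str, voice: str, output: str) -> None:
    """Run the CLI and raise CalledProcessError if it exits with a non-zero status."""
    subprocess.run(build_synthesis_command(text, voice, output), check=True)


# Example usage (requires VietTTS to be installed):
# synthesize("Xin chào", "0", "test.wav")
```

Passing the arguments as a list avoids shell quoting problems with Vietnamese text that contains spaces or punctuation.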
 ### API Client
 #### CURL
 ```bash
+# Get all built-in voices
+curl --location http://localhost:8298/v1/voices
+# OpenAI format (built-in voices)
 curl http://localhost:8298/v1/audio/speech \
+  -H "Authorization: Bearer viet-tts" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "tts-1",
+    "input": "Xin chào Việt Nam.",
+    "voice": "son-tung-mtp"
+  }' \
+  --output speech.wav
+# API with voice from local file
+curl --location http://localhost:8298/v1/tts \
+  --form 'text="xin chào"' \
+  --form 'audio_file=@"/home/viettts/Downloads/voice.mp4"' \
   --output speech.wav
 ```
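For environments without the `openai` package, the OpenAI-format request above can be reproduced with the Python standard library. This is a minimal sketch that mirrors the curl call (same URL, headers, and JSON body); the helper names `build_speech_request` and `synthesize_to_file` are hypothetical, and the code assumes the server is running at `http://localhost:8298`.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8298"  # server address used in the examples above


def build_speech_request(text: str, voice: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request for /v1/audio/speech, mirroring the curl example."""
    payload = {"model": "tts-1", "input": text, "voice": voice}
    return urllib.request.Request(
        url=f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer viet-tts",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(text: str, voice: str, output_path: str) -> None:
    """Send the request and write the returned audio bytes to output_path."""
    request = build_speech_request(text, voice)
    with urllib.request.urlopen(request) as response, open(output_path, "wb") as f:
        f.write(response.read())


# Example usage (requires a running server):
# synthesize_to_file("Xin chào Việt Nam.", "son-tung-mtp", "speech.wav")
```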