
SYSPIN Hackathon TTS API Documentation

Overview

This API provides a Text-to-Speech (TTS) service that converts input text into speech audio. It supports multiple Indian languages and offers voice customization through predefined male and female speaker references.


Endpoint: /Get_Inference

  • Method: GET
  • Description: Generates speech audio from the provided text using the specified language and speaker.

Query Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| text | string | Yes | The input text to be converted into speech. |
| lang | string | Yes | The language of the input text. Acceptable values: bhojpuri, bengali, english, gujarati, hindi, chhattisgarhi, kannada, magahi, maithili, marathi, telugu. |
| speaker | string | Yes | The desired speaker's voice, in the form <language>_<gender>, e.g. hindi_male, english_female. See the available speakers below. |
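Since all three parameters are passed as a query string, it can be convenient to assemble the request URL up front so that non-ASCII text is percent-encoded correctly. A minimal sketch, assuming the server runs at localhost:8080 (the helper name build_inference_url is illustrative, not part of the API):

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8080/Get_Inference"  # assumed host and port

def build_inference_url(text, lang, speaker):
    """Assemble the GET request URL with properly encoded query parameters."""
    query = urlencode({"text": text, "lang": lang, "speaker": speaker})
    return f"{BASE_URL}?{query}"

url = build_inference_url("नमस्ते", "hindi", "hindi_male")
```

urlencode handles spaces and non-Latin scripts, so the resulting URL is safe to paste into a browser or pass to any HTTP client.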

Available Speakers

| Language | Language Code | Male Speaker | Female Speaker |
|----------|---------------|--------------|----------------|
| chhattisgarhi | hne | chhattisgarhi_male | chhattisgarhi_female |
| kannada | kn | kannada_male | kannada_female |
| maithili | mai | maithili_male | maithili_female |
| telugu | te | telugu_male | telugu_female |
| bengali | bn | bengali_male | bengali_female |
| bhojpuri | bho | bhojpuri_male | bhojpuri_female |
| marathi | mr | marathi_male | marathi_female |
| gujarati | gu | gujarati_male | gujarati_female |
| hindi | hi | hindi_male | hindi_female |
| magahi | mag | magahi_male | magahi_female |
| english | en | english_male | english_female |
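If your application tracks languages by their short codes, the table above maps directly to a small lookup. A sketch (the mapping and the speaker_for helper are illustrative, built from the table, not part of the API):

```python
# Mapping from the language codes in the table above to the API's lang values.
CODE_TO_LANG = {
    "hne": "chhattisgarhi", "kn": "kannada", "mai": "maithili", "te": "telugu",
    "bn": "bengali", "bho": "bhojpuri", "mr": "marathi", "gu": "gujarati",
    "hi": "hindi", "mag": "magahi", "en": "english",
}

def speaker_for(code, gender):
    """Build the speaker string (<language>_<gender>) from a language code."""
    if gender not in ("male", "female"):
        raise ValueError(f"unknown gender: {gender!r}")
    return f"{CODE_TO_LANG[code]}_{gender}"
```

For example, speaker_for("kn", "female") yields "kannada_female".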

Responses

  • 200 OK: Returns a WAV audio file as a streaming response containing the synthesized speech.

  • 422 Unprocessable Entity: Returned when:

    • Any of the required query parameters (text, lang, speaker) are missing.
    • The specified lang is not supported.
    • The specified speaker is not available.
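Because each of these conditions produces the same 422 status, it can save a round trip to validate parameters client-side before sending the request. A minimal sketch mirroring the rules above (the sets and validate_params are illustrative, derived from the documented values):

```python
# Supported values as documented above.
SUPPORTED_LANGS = {
    "bhojpuri", "bengali", "english", "gujarati", "hindi", "chhattisgarhi",
    "kannada", "magahi", "maithili", "marathi", "telugu",
}
SUPPORTED_SPEAKERS = {
    f"{lang}_{gender}" for lang in SUPPORTED_LANGS for gender in ("male", "female")
}

def validate_params(text, lang, speaker):
    """Return a list of problems that would trigger a 422 from the server."""
    errors = []
    if not text:
        errors.append("missing text")
    if lang not in SUPPORTED_LANGS:
        errors.append(f"unsupported lang: {lang!r}")
    if speaker not in SUPPORTED_SPEAKERS:
        errors.append(f"unavailable speaker: {speaker!r}")
    return errors
```

An empty list means the request should pass the server's parameter checks.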

Running the Server

To start the FastAPI server:

docker build -t your_image_name ./
docker run -d -v /path/to/this/code/dir/:/app/ -p 8080:8080 your_image_name API_main.py

Hosting on a GPU

To run your FastAPI-based Text-to-Speech (TTS) server inside a Docker container with GPU support, follow these steps:


Prerequisites

  1. NVIDIA GPU: Ensure your system has an NVIDIA GPU installed.

  2. NVIDIA Drivers: Install the appropriate NVIDIA drivers for your GPU.

  3. Docker: Install Docker on your system.

  4. NVIDIA Container Toolkit: Install the NVIDIA Container Toolkit to enable GPU support in Docker containers.


Installation Steps

1. Install NVIDIA Drivers

Ensure that the NVIDIA drivers compatible with your GPU are installed on your system.

2. Install Docker

If Docker is not already installed, you can install it by following the official Docker installation guide for your operating system.

3. Install NVIDIA Container Toolkit

The NVIDIA Container Toolkit allows Docker containers to utilize the GPU.

For Ubuntu:

# Add the package repositories
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list

# Update the package lists
sudo apt-get update

# Install the NVIDIA Container Toolkit
sudo apt-get install -y nvidia-container-toolkit

# Restart the Docker daemon to apply changes
sudo systemctl restart docker

For other operating systems: Refer to the NVIDIA Container Toolkit installation guide for detailed instructions.

4. Verify GPU Access in Docker

To confirm that Docker can access your GPU, run the following command:

docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi

Running Your FastAPI TTS Server with GPU Support

Assuming your FastAPI TTS application is containerized and ready to run:

  1. Build Your Docker Image

Navigate to the directory containing your Dockerfile and build the Docker image:

docker build -t your_image_name .

  2. Run the Docker Container with GPU Support

Start the container with GPU access enabled:

docker run --gpus all -p 8080:8080 -v /path/to/this/code/dir/:/app/ your_image_name API_main.py

Example API Call

import requests

# Define the base URL of your API
base_url = 'http://localhost:8080/Get_Inference'

# Set up the query parameters
params = {
    'text': 'ಮಾದರಿಯು ಸರಿಯಾಗಿ ಕಾರ್ಯನಿರ್ವಹಿಸುತ್ತಿದೆಯೇ ಎಂದು ಖಚಿತಪಡಿಸಿಕೊಳ್ಳಲು ಬಳಸಲಾಗುವ ಪರೀಕ್ಷಾ ವಾಕ್ಯ ಇದು.',
    'lang': 'kannada',
    'speaker': 'kannada_female'
}

# Send the GET request
response = requests.get(base_url, params=params)

# Check if the request was successful
if response.status_code == 200:
    # Save the audio content to a file
    with open('output.wav', 'wb') as f:
        f.write(response.content)
    print("Audio saved as 'output.wav'")
else:
    # Print the error message
    print(f"Request failed with status code {response.status_code}")
    print("Response:", response.text)
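Since the server returns the WAV file as a streaming response, long inputs can also be written to disk in chunks rather than buffered whole in memory. A stdlib-only sketch (the function name synthesize_to_file and the injectable opener parameter are illustrative choices, not part of the API):

```python
from urllib.parse import urlencode
from urllib.request import urlopen

def synthesize_to_file(base_url, text, lang, speaker, out_path, opener=urlopen):
    """Stream the WAV response to disk in fixed-size chunks.

    `opener` defaults to urllib's urlopen and is injectable for testing.
    """
    url = f"{base_url}?{urlencode({'text': text, 'lang': lang, 'speaker': speaker})}"
    with opener(url) as resp, open(out_path, "wb") as f:
        while chunk := resp.read(8192):
            f.write(chunk)
    return out_path
```

Usage mirrors the requests example: synthesize_to_file('http://localhost:8080/Get_Inference', 'some text', 'hindi', 'hindi_female', 'output.wav').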