
metadata
title: TTS
emoji: 🔥
colorFrom: green
colorTo: red
sdk: gradio
sdk_version: 4.36.1
app_file: app.py
pinned: false
license: mit
hf_oauth: true
hf_oauth_expiration_minutes: 480

Getting Started

This Space lets you create speech audio from text. It currently works with OpenAI and Cartesia AI. Why use this instead of the playgrounds from the respective services? Because you can convert text longer than the maximum input length a single API call allows. When the text exceeds that limit, this Space splits it into chunks, submits the chunks to the API in parallel, and stitches the resulting audio back together into a single MP3.
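The chunk, convert in parallel, and stitch flow described above can be sketched roughly as follows. This is an illustration only, not the code in app.py: the names `chunk_text` and `synthesize_long_text`, the `MAX_CHARS` value, and the sentence-boundary splitting are all assumptions, and `synthesize` stands in for whatever function calls the TTS provider and returns audio bytes.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

MAX_CHARS = 4096  # assumed per-request limit; check your provider's docs


def chunk_text(text: str, max_chars: int = MAX_CHARS) -> List[str]:
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # Hard-split any single sentence that is itself too long.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def synthesize_long_text(
    text: str,
    synthesize: Callable[[str], bytes],
    max_chars: int = MAX_CHARS,
    max_workers: int = 4,
) -> bytes:
    """Convert each chunk in parallel and join the audio in the original order."""
    chunks = chunk_text(text, max_chars)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so the audio stays in sequence.
        parts = list(pool.map(synthesize, chunks))
    # Naive byte concatenation; a real app may stitch the audio differently.
    return b"".join(parts)


if __name__ == "__main__":
    # Stand-in for a real TTS call: it just encodes the text as bytes.
    fake_tts = lambda chunk: chunk.encode("utf-8")
    print(chunk_text("One. Two. Three.", max_chars=10))
    print(synthesize_long_text("One. Two. Three.", fake_tts, max_chars=10))
```

Threads suit this job because each request spends most of its time waiting on the network; keeping the results in input order is what makes the stitched audio read correctly.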

Running Locally

git clone https://huggingface.co/spaces/matdmiller/tts-openai
cd tts-openai
pip install -r requirements.txt

Next, create a file named tts_openai_secrets.py to hold the secrets, or set them as environment variables yourself.

import os

# Placeholder values; replace them with your own keys and username.
os.environ["OPENAI_API_KEY"] = "sk-xxx"
os.environ["CARTESIA_API_KEY"] = "xxx"
os.environ["ALLOWED_OAUTH_PROFILE_USERNAMES"] = "<your huggingface username>"  # comma delimited
os.environ["REQUIRE_AUTH"] = "false"

This file is listed in .gitignore, so it will not be committed when you push to your repo. This is very important because 🤗 Spaces repos are public by default and you don't want to leak your API keys.
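For orientation, here is one way an app could pick these values up at startup. It is a minimal sketch under stated assumptions, not the logic in app.py: the `Settings` class and `load_settings` function are hypothetical names, and treating anything other than an explicit 'false' as "auth required" is an assumed fail-closed default.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

try:
    # Optional local secrets file; importing it sets os.environ entries.
    import tts_openai_secrets  # noqa: F401
except ImportError:
    pass  # fall back to variables already present in the environment


@dataclass(frozen=True)
class Settings:
    openai_api_key: Optional[str]
    cartesia_api_key: Optional[str]
    require_auth: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the keys and the auth flag from a mapping of environment variables."""
    # Assumption: only an explicit "false" disables auth; a missing value keeps it on.
    require_auth = env.get("REQUIRE_AUTH", "true").strip().lower() != "false"
    return Settings(
        openai_api_key=env.get("OPENAI_API_KEY"),
        cartesia_api_key=env.get("CARTESIA_API_KEY"),
        require_auth=require_auth,
    )


if __name__ == "__main__":
    settings = load_settings()
    print("auth required:", settings.require_auth)
```

Passing the environment in as a mapping keeps the function easy to exercise with a plain dict, without touching your real keys.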

Finally, start the app locally:

python app.py

Cloning to run in your own 🤗 Space

Click the ... in the top right-hand corner of my Space and click Duplicate Space.

A photo showing how to duplicate a Hugging Face Space.

This will bring up a form that you need to fill out with the appropriate keys and settings.

A photo showing the form for duplicating a Hugging Face Space.

  • Owner: Your HF Username
  • Visibility: Public (or Private if you know what you're doing). By default, this app is secured with 🤗 OAuth and only allows the usernames you specify in the ALLOWED_OAUTH_PROFILE_USERNAMES property.
  • Space Hardware: Free CPU Basic is plenty of horsepower for this app because all of the TTS work happens behind the API.
  • OPENAI_API_KEY: Input your own OpenAI API key which you can get by signing up for an account on https://platform.openai.com. This is completely separate from ChatGPT.
  • CARTESIA_API_KEY: Input your own Cartesia AI API key, which you can obtain from https://play.cartesia.ai/subscription.
  • ALLOWED_OAUTH_PROFILE_USERNAMES: Add a comma-separated list of Hugging Face usernames that are allowed to use the Space. Only usernames listed here will be allowed to generate speech audio, though anyone can view the site. Ex: santa1225,clem2024. If you have a single username, don't add a comma.
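To make the ALLOWED_OAUTH_PROFILE_USERNAMES format concrete, the sketch below shows how a comma-separated value like the one above might be parsed and checked. The function names are hypothetical, and the exact-match, case-sensitive comparison is an assumption; the app's own check may differ.

```python
from typing import FrozenSet


def parse_allowed_usernames(raw: str) -> FrozenSet[str]:
    """Turn 'santa1225,clem2024' into a set of usernames, ignoring stray spaces."""
    return frozenset(name.strip() for name in raw.split(",") if name.strip())


def is_allowed(username: str, raw_allowed: str) -> bool:
    """Return True only if username appears in the comma-separated allow list."""
    # Assumption: usernames are compared exactly as written (case-sensitive).
    return username in parse_allowed_usernames(raw_allowed)


if __name__ == "__main__":
    print(sorted(parse_allowed_usernames("santa1225,clem2024")))
    print(is_allowed("clem2024", "santa1225,clem2024"))
```

A single username with no comma parses to a one-element set, which matches the note in the last bullet.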

Click Duplicate Space. Your new space should be created within a couple of minutes.

TIP: As of Gradio version 4.36.1, the Hugging Face login button is broken when the Space is viewed in an iframe, which is the default view of a Space. To use the Space, navigate directly to its fully qualified URL, which is typically https://<your hf username>-<your hf space name>.hf.space. For example, mine is https://matdmiller-tts-openai.hf.space/.
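If you want to build that direct URL in a script, a tiny helper following the pattern in the tip could look like this. `direct_space_url` is a hypothetical name, and the sketch assumes the username and Space name can be dropped into the host name unchanged; names containing other characters may be mapped differently by Hugging Face.

```python
def direct_space_url(username: str, space_name: str) -> str:
    """Build the direct URL using the <username>-<space name>.hf.space pattern."""
    # Assumption: both parts are already valid in a host name as written.
    return f"https://{username}-{space_name}.hf.space"


if __name__ == "__main__":
    print(direct_space_url("matdmiller", "tts-openai"))
```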

Additional HF Spaces config info: https://huggingface.co/docs/hub/spaces-config-reference