metadata
title: Project Charles
emoji: π
colorFrom: gray
colorTo: green
sdk: streamlit
python_version: 3.11.6
sdk_version: 1.26.0
app_file: app.py
pinned: true
license: mit
models:
- laion/CLIP-ViT-L-14-DataComp.XL-s13B-b90K
Project Charles
A toy app for a voice-based agent.
Video Demo -> Early Test
Required Environment Variables/Keys
- OPENAI_API_KEY - required for ChatGPT
- ELEVENLABS_API_KEY - required for ElevenLabs TTS
Optional Environment Variables/Keys
- TWILIO_ACCOUNT_SID - reduces time for WebRTC connection
- TWILIO_AUTH_TOKEN - reduces time for WebRTC connection
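Before starting the app, it can help to confirm that the keys listed above are set. The following is a minimal sketch, not part of the project itself: the helper name `check_environment` and the constants are hypothetical, and only the four variable names come from this README.

```python
import os

# Variable names taken from the lists above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ELEVENLABS_API_KEY")
OPTIONAL_KEYS = ("TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN")


def check_environment(env=None):
    """Return (missing_required, missing_optional) as lists of variable names.

    `env` defaults to os.environ; pass a dict to check other settings.
    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    # A key counts as missing when it is unset or set to an empty string.
    missing_required = [key for key in REQUIRED_KEYS if not env.get(key)]
    missing_optional = [key for key in OPTIONAL_KEYS if not env.get(key)]
    return missing_required, missing_optional
```

Calling `check_environment()` with no arguments inspects the current process environment. An empty first list means both required keys are present; names in the second list only indicate that the WebRTC connection may take longer to establish.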
How to install
pip install -r requirements.txt
Install packages from packages.txt
macOS (Homebrew)
xargs brew install < packages.txt
Linux (Ubuntu, apt)
sudo xargs -a packages.txt apt-get install -y
Linux (Fedora, dnf)
sudo xargs -a packages.txt dnf install -y
Windows (Chocolatey)
Get-Content packages.txt | ForEach-Object { choco install $_ -y }
How to run
streamlit run app.py
Known Issues
- The first run may be slow because the model has to be downloaded. You may want to refresh the page after the first run.
- Audio errors may occur because of the way the app converts the ElevenLabs stream to WebRTC audio.
- Audio errors may also occur if the server is running slowly.
- The app may hang, in which case the server needs a hard reset.
Architecture
Key Technologies:
- Ray Actors & Queues - backbone of interprocess communication
- Streamlit - UI & WebRTC connection
- Vosk - speech to text
- ChatGPT - text to text
- ElevenLabs TTS - text to speech
- Twilio - optional faster WebRTC connection
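The technologies above suggest a staged pipeline: speech to text, then text to text, then text to speech, with queues between the stages. The sketch below illustrates that shape only. It is an assumption about how the pieces fit together, it uses standard-library threads and queues in place of Ray actors and queues so that it runs without extra dependencies, and the three stage functions are hypothetical stubs rather than calls to Vosk, ChatGPT, or ElevenLabs.

```python
import queue
import threading

_STOP = object()  # sentinel that tells a stage to shut down


def run_stage(transform, inbox, outbox):
    """Apply `transform` to each item from `inbox` and put the result on `outbox`."""
    while True:
        item = inbox.get()
        if item is _STOP:
            outbox.put(_STOP)  # propagate shutdown to the next stage
            return
        outbox.put(transform(item))


# Hypothetical stubs standing in for Vosk, ChatGPT, and ElevenLabs.
def speech_to_text(audio_chunk):
    return f"text({audio_chunk})"


def text_to_text(prompt):
    return f"reply({prompt})"


def text_to_speech(reply):
    return f"audio({reply})"


def run_pipeline(audio_chunks):
    """Push chunks through the three stages and collect the final outputs in order."""
    stages = [speech_to_text, text_to_text, text_to_speech]
    # One queue in front of each stage, plus one for the final results.
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=run_stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]
    for worker in workers:
        worker.start()
    for chunk in audio_chunks:
        queues[0].put(chunk)
    queues[0].put(_STOP)

    results = []
    while True:
        item = queues[-1].get()
        if item is _STOP:
            break
        results.append(item)
    for worker in workers:
        worker.join()
    return results
```

Because each stage runs independently and communicates only through its queues, a slow stage delays downstream output without blocking upstream work. In a Ray-based version, each stage would presumably be an actor and each `queue.Queue` a Ray queue, though that mapping is not confirmed by this README.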