---
title: Project Charles
emoji: 👀
colorFrom: gray
colorTo: green
sdk: streamlit
python_version: 3.11
sdk_version: 1.26.0
app_file: ui_app.py
pinned: true
license: mit
models: ["laion/CLIP-ViT-L-14-DataComp.XL-s13B-b90K"]
---

# Project Charles

Toy app for a voice-based agent

Video Demo -> [Early Test](https://www.linkedin.com/posts/sohojoe_ray-vosk-chatgpt-activity-7100365711226671104-c2Nv?utm_source=share&utm_medium)

## Required Environment Variables/Keys

* OPENAI_API_KEY - required for ChatGPT
* ELEVENLABS_API_KEY - required for ElevenLabs TTS

## Optional Environment Variables/Keys

* TWILIO_ACCOUNT_SID - reduces time for WebRTC connection
* TWILIO_AUTH_TOKEN - reduces time for WebRTC connection

## How to install

Install the Python dependencies:

```bash
pip install -r requirements.txt
```

Then install the system packages listed in packages.txt.

macOS (Homebrew)

```bash
xargs brew install < packages.txt
```

Linux (Ubuntu, apt)

```bash
sudo xargs -a packages.txt apt-get install -y
```

Linux (Fedora, dnf)

```bash
sudo xargs -a packages.txt dnf install -y
```

Windows (Chocolatey, run in PowerShell)

```powershell
Get-Content packages.txt | ForEach-Object { choco install $_ -y }
```

## How to run

```bash
streamlit run ui_app.py
```

## Known Issues

* The first run may be slow because the model has to be downloaded. You may want to refresh the page after the first run.
* Audio errors may occur due to the way the app converts the ElevenLabs stream to WebRTC audio.
* Audio errors may also happen if the server is running slowly.
* The app may hang, in which case the server needs a hard reset.

## Architecture

![Image of the architecture](./images/ProjectCharlesCommunicationArchitecture.jpg)

Key Technologies:

* Ray Actors & Queues - backbone of interprocess communication
* Streamlit - UI & WebRTC connection
* Vosk - speech to text
* ChatGPT - text to text
* ElevenLabs TTS - text to speech
* Twilio - optional faster WebRTC connection
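
The speech to text, text to text, text to speech chain in the technology list above can be pictured as stages connected by queues. The sketch below is only an illustration of that idea, not the project's code: it swaps Ray Actors & Queues for the standard library's `threading` and `queue` so it runs without extra dependencies, and `speech_to_text`, `text_to_text`, `text_to_speech`, `stage`, and `run_pipeline` are hypothetical stand-ins rather than real Vosk, ChatGPT, or ElevenLabs calls.

```python
import queue
import threading

# Hypothetical stand-ins for the real services (Vosk, ChatGPT, ElevenLabs).
# They only transform data locally so the sketch runs offline.
def speech_to_text(audio: bytes) -> str:
    return audio.decode("utf-8")  # pretend the audio is already text

def text_to_text(prompt: str) -> str:
    return f"echo: {prompt}"

def text_to_speech(text: str) -> bytes:
    return text.encode("utf-8")

STOP = object()  # sentinel that tells each stage to shut down

def stage(fn, inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Read items from inbox, apply fn, and forward results to outbox."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)  # propagate shutdown to the next stage
            return
        outbox.put(fn(item))

def run_pipeline(chunks):
    """Push audio chunks through the three stages and collect the output."""
    q_audio, q_text, q_reply, q_speech = (queue.Queue() for _ in range(4))
    workers = [
        threading.Thread(target=stage, args=(speech_to_text, q_audio, q_text)),
        threading.Thread(target=stage, args=(text_to_text, q_text, q_reply)),
        threading.Thread(target=stage, args=(text_to_speech, q_reply, q_speech)),
    ]
    for worker in workers:
        worker.start()
    for chunk in chunks:
        q_audio.put(chunk)
    q_audio.put(STOP)

    results = []
    while True:
        out = q_speech.get()
        if out is STOP:
            break
        results.append(out)
    for worker in workers:
        worker.join()
    return results

if __name__ == "__main__":
    print(run_pipeline([b"hello", b"how are you"]))
```

Because each stage only talks to its neighbours through a queue, a slow stage delays downstream output without blocking upstream input, which is one reason a queue-based layout suits streaming audio. In the actual app each stage would presumably be a Ray actor and each queue a Ray queue, but that mapping is an assumption based on the technology list, not something this README spells out.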