	---
	title: Project Charles
	emoji: 👀
	colorFrom: gray
	colorTo: green
	sdk: streamlit
	python_version: 3.11
	sdk_version: 1.26.0
	app_file: ui_app.py
	pinned: true
	license: mit
	models: ["laion/CLIP-ViT-L-14-DataComp.XL-s13B-b90K"]
	---

	# Project Charles

	A toy app for a voice-based agent.

	Video Demo -> [Early Test](https://www.linkedin.com/posts/sohojoe_ray-vosk-chatgpt-activity-7100365711226671104-c2Nv?utm_source=share&utm_medium)

	## Required Environment Variables/Keys

	* OPENAI_API_KEY - required for ChatGPT
	* ELEVENLABS_API_KEY - required for ElevenLabs TTS

	## Optional Environment Variables/Keys

	* TWILIO_ACCOUNT_SID - reduces time for WebRTC connection
	* TWILIO_AUTH_TOKEN - reduces time for WebRTC connection
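
A quick way to confirm the keys above are set before launching is a small check like the one below. This is only a sketch: the variable names come from this README, but `check_environment` is a hypothetical helper and is not part of the project's code.

```python
import os
from typing import Mapping

# Variable names are taken from this README; the helper itself is hypothetical.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ELEVENLABS_API_KEY")
TWILIO_KEYS = ("TWILIO_ACCOUNT_SID", "TWILIO_AUTH_TOKEN")


def check_environment(env: Mapping[str, str] = os.environ) -> dict:
    """Report which required keys are missing and whether both Twilio keys are set."""
    # A key counts as missing if it is absent or set to an empty string.
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    # Twilio is treated as configured only when both of its keys are present.
    twilio_enabled = all(env.get(key) for key in TWILIO_KEYS)
    return {"missing": missing, "twilio_enabled": twilio_enabled}
```

Calling `check_environment()` with no arguments inspects the current process environment. A non-empty `missing` list means the app cannot reach ChatGPT or ElevenLabs, while `twilio_enabled` being `False` only means the WebRTC connection may take longer to establish.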

	## How to install

	```bash
	pip install -r requirements.txt
	```

	Then install the system packages listed in `packages.txt` with your platform's package manager:

	macOS (Homebrew)
	```bash
	xargs brew install < packages.txt
	```

	Linux (Ubuntu, apt)
	```bash
	sudo xargs -a packages.txt apt-get install -y
	```

	Linux (Fedora, dnf)
	```bash
	sudo xargs -a packages.txt dnf install -y
	```

	Windows (Chocolatey)
	```powershell
	Get-Content packages.txt | ForEach-Object { choco install $_ -y }
	```



	## How to run

	```bash
	streamlit run ui_app.py
	```

	## Known Issues

	* The first run may be slow because the model has to be downloaded. You may want to refresh the page after the first run.
	* Audio errors may occur because of the way the app converts the ElevenLabs stream to WebRTC audio.
	* Audio errors may also occur if the server is running slowly.
	* The app may hang, in which case the server needs a hard reset.

	## Architecture

	![Image of the architecture](./images/ProjectCharlesCommunicationArchitecture.jpg)

	Key Technologies:

	* Ray Actors & Queues - backbone of interprocess communication
	* Streamlit - UI & WebRTC connection
	* Vosk - speech to text
	* ChatGPT - text to text
	* ElevenLabs TTS - text to speech
	* Twilio - optional faster WebRTC connection
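
The queue-connected stages listed above can be illustrated with a rough sketch. To keep it runnable without extra dependencies, it deliberately swaps in standard-library threads and `queue.Queue` for Ray actors and queues, and it assumes the stages are chained in the order speech to text, then chat, then text to speech. All function names and the placeholder transforms are hypothetical; the real app calls Vosk, ChatGPT, and ElevenLabs at these points.

```python
import queue
import threading

STOP = object()  # sentinel that tells each stage to shut down


def stage(transform, inbox, outbox):
    """Read items from inbox, apply transform, and forward the result."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)  # propagate shutdown to the next stage
            return
        outbox.put(transform(item))


# Placeholder transforms standing in for the real services (assumptions).
def speech_to_text(audio):
    return f"text({audio})"


def chat(prompt):
    return f"reply({prompt})"


def text_to_speech(text):
    return f"audio({text})"


def run_pipeline(audio_chunks):
    """Push chunks through the three stages and collect the output in order."""
    transforms = [speech_to_text, chat, text_to_speech]
    # One queue in front of each stage, plus one for the final output.
    queues = [queue.Queue() for _ in range(len(transforms) + 1)]
    workers = [
        threading.Thread(target=stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(transforms)
    ]
    for worker in workers:
        worker.start()
    for chunk in audio_chunks:
        queues[0].put(chunk)
    queues[0].put(STOP)

    results = []
    while (item := queues[-1].get()) is not STOP:
        results.append(item)
    for worker in workers:
        worker.join()
    return results
```

Because each stage only talks to its neighbours through a queue, a slow stage delays the items behind it without blocking the stages upstream from accepting new input. That decoupling is the general idea behind using actors and queues as the communication backbone.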