Generate speech from text in multiple voices
Run the MiniCPM model via a Flask API
Generate a video from a text prompt
Generate word-by-word subtitles and burn them into video