Upload an image, detect objects in it, and hear descriptions of them
Translate English text into multiple languages
Extract and summarize YouTube video transcripts