Generate talking-face animations from still images and audio
Generate realistic audio from text
Interact with images using text prompts