Transcribe audio from microphone, file, or YouTube link
Generate images from text prompts and reference images