Convert screenshots into structured UI elements
Transcribe audio with emotion and event labels
Upload documents for Q&A with Qwen-Turbo