Convert spoken words to text
Generate realistic images from textual descriptions
Chat with Qwen2-72B-Instruct using a system prompt