Submit media inputs to generate text and speech responses
Send text and get detailed responses
QwQ-32B-Preview