Qwen2.5 Omni 7B Demo 🏆 • Running • 300 • Generate text and speech responses from text, images, or audio input
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • Updated 16 days ago • 309k • 1.39k