IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System 🎙 Generate speech from text using reference audio • Running on Zero • 48
microsoft/Phi-4-multimodal-instruct Automatic Speech Recognition • Updated 15 days ago • 622k • 1.32k