Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts • Paper • 2307.07218 • Published Jul 14, 2023
Macaw-LLM: Multi-Modal Language Modeling with Image, Audio, Video, and Text Integration • Paper • 2306.09093 • Published Jun 15, 2023