SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models Paper • 2405.08317 • Published May 14 • 9
Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities Paper • 2405.18669 • Published May 29 • 11
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models Paper • 2406.02430 • Published Jun 4 • 28
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling Paper • 2408.16532 • Published 13 days ago • 44