Fake it to make it: Using synthetic data to remedy the data shortage in joint multimodal speech-and-gesture synthesis Paper • 2404.19622 • Published Apr 30, 2024 • 2
MM-Conv: A Multi-modal Conversational Dataset for Virtual Humans Paper • 2410.00253 • Published Sep 30, 2024
Diffusion-Based Co-Speech Gesture Generation Using Joint Text and Audio Representation Paper • 2309.05455 • Published Sep 11, 2023
Matcha-TTS: A fast TTS architecture with conditional flow matching Paper • 2309.03199 • Published Sep 6, 2023 • 11