Audio-visual Controlled Video Diffusion with Masked Selective State Spaces Modeling for Natural Talking Head Generation • Paper • 2504.02542 • Published 3 days ago • 17
TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio Motion Embedding and Diffusion Interpolation • Paper • 2410.04221 • Published Oct 5, 2024 • 3
EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling • Paper • 2401.00374 • Published Dec 31, 2023 • 1