Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model Paper • 2408.11039 • Published Aug 20 • 58
Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance Paper • 2401.15687 • Published Jan 28 • 22