DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation • Paper • 2412.18597 • Published 10 days ago • 19
SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs • Paper • 2306.17842 • Published Jun 30, 2023 • 9