ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment • Paper • 2403.05135 • Published Mar 8
BLINK: Multimodal Large Language Models Can See but Not Perceive • Paper • 2404.12390 • Published 21 days ago