VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping Paper • 2412.11279 • Published 10 days ago • 12
EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM Paper • 2412.09618 • Published 13 days ago • 21
Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models Paper • 2406.11831 • Published Jun 17 • 21
VisCoT Collection Visual CoT: Unleashing Chain-of-Thought Reasoning in the Multi-Modal Language Model • 5 items • Updated Jun 13 • 2