182 EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions · 4 authors 19
87 Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models · 12 authors 4
21 DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Model · 6 authors 1
21 OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web · 7 authors 5
15 Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners · 5 authors 1
10 Playground v2.5: Three Insights towards Enhancing Aesthetic Quality in Text-to-Image Generation · 6 authors 1