- HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion Models
  Paper • 2312.00079 • Published • 14
- VideoBooth: Diffusion-based Video Generation with Image Prompts
  Paper • 2312.00777 • Published • 19
- CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation
  Paper • 2311.18775 • Published • 6
- Generative Powers of Ten
  Paper • 2312.02149 • Published • 4
Collections
Collections including paper arxiv:2402.00769
- VideoBooth: Diffusion-based Video Generation with Image Prompts
  Paper • 2312.00777 • Published • 19
- MotionCtrl: A Unified and Flexible Motion Controller for Video Generation
  Paper • 2312.03641 • Published • 19
- GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation
  Paper • 2312.04557 • Published • 12
- DreamVideo: Composing Your Dream Videos with Customized Subject and Motion
  Paper • 2312.04433 • Published • 9

- Make Pixels Dance: High-Dynamic Video Generation
  Paper • 2311.10982 • Published • 65
- Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning
  Paper • 2311.10709 • Published • 24
- AutoStory: Generating Diverse Storytelling Images with Minimal Human Effort
  Paper • 2311.11243 • Published • 14
- LivePhoto: Real Image Animation with Text-guided Motion Control
  Paper • 2312.02928 • Published • 15

- OmnimatteRF: Robust Omnimatte with 3D Background Modeling
  Paper • 2309.07749 • Published • 6
- AudioSR: Versatile Audio Super-resolution at Scale
  Paper • 2309.07314 • Published • 23
- Generative Image Dynamics
  Paper • 2309.07906 • Published • 51
- MagiCapture: High-Resolution Multi-Concept Portrait Customization
  Paper • 2309.06895 • Published • 27

- InstructDiffusion: A Generalist Modeling Interface for Vision Tasks
  Paper • 2309.03895 • Published • 11
- ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning
  Paper • 2309.16650 • Published • 7
- CCEdit: Creative and Controllable Video Editing via Diffusion Models
  Paper • 2309.16496 • Published • 7
- FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling
  Paper • 2310.15169 • Published • 8

- Text2Control3D: Controllable 3D Avatar Generation in Neural Radiance Fields using Geometry-Guided Text-to-Image Diffusion Model
  Paper • 2309.03550 • Published • 11
- Memory Augmented Language Models through Mixture of Word Experts
  Paper • 2311.10768 • Published • 16
- GAIA: a benchmark for General AI Assistants
  Paper • 2311.12983 • Published • 174
- GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning
  Paper • 2311.12631 • Published • 12