- ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
  Paper • 2406.04325 • Published • 71
- SF-V: Single Forward Video Generation Model
  Paper • 2406.04324 • Published • 23
- VideoTetris: Towards Compositional Text-to-Video Generation
  Paper • 2406.04277 • Published • 22
- Vript: A Video Is Worth Thousands of Words
  Paper • 2406.06040 • Published • 22
Collections including paper arxiv:2406.04324
- MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model
  Paper • 2405.20222 • Published • 10
- ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation
  Paper • 2406.00908 • Published • 11
- CamCo: Camera-Controllable 3D-Consistent Image-to-Video Generation
  Paper • 2406.02509 • Published • 8
- I4VGen: Image as Stepping Stone for Text-to-Video Generation
  Paper • 2406.02230 • Published • 15

- Video as the New Language for Real-World Decision Making
  Paper • 2402.17139 • Published • 18
- Learning and Leveraging World Models in Visual Representation Learning
  Paper • 2403.00504 • Published • 31
- MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies
  Paper • 2403.01422 • Published • 26
- VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models
  Paper • 2403.05438 • Published • 18

- WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
  Paper • 2401.09985 • Published • 14
- CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects
  Paper • 2401.09962 • Published • 7
- Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution
  Paper • 2401.10404 • Published • 10
- ActAnywhere: Subject-Aware Video Background Generation
  Paper • 2401.10822 • Published • 13

- One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
  Paper • 2306.07967 • Published • 24
- Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
  Paper • 2306.07954 • Published • 113
- TryOnDiffusion: A Tale of Two UNets
  Paper • 2306.08276 • Published • 72
- Seeing the World through Your Eyes
  Paper • 2306.09348 • Published • 32

- FreeU: Free Lunch in Diffusion U-Net
  Paper • 2309.11497 • Published • 64
- Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models
  Paper • 2311.12092 • Published • 21
- ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs
  Paper • 2311.13600 • Published • 42
- PALP: Prompt Aligned Personalization of Text-to-Image Models
  Paper • 2401.06105 • Published • 46

- HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion
  Paper • 2310.08579 • Published • 14
- MotionDirector: Motion Customization of Text-to-Video Diffusion Models
  Paper • 2310.08465 • Published • 14
- DPM-Solver-v3: Improved Diffusion ODE Solver with Empirical Model Statistics
  Paper • 2310.13268 • Published • 17
- VideoCrafter1: Open Diffusion Models for High-Quality Video Generation
  Paper • 2310.19512 • Published • 15