video - a zzfive Collection

zzfive 's Collections

3d

image

LLMs

video

agent

cv

video

updated about 7 hours ago

WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens

Paper • 2401.09985 • Published Jan 18 • 13
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects

Paper • 2401.09962 • Published Jan 18 • 6
Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution

Paper • 2401.10404 • Published Jan 18 • 8
ActAnywhere: Subject-Aware Video Background Generation

Paper • 2401.10822 • Published Jan 19 • 12
Lumiere: A Space-Time Diffusion Model for Video Generation

Paper • 2401.12945 • Published Jan 23 • 82
AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning

Paper • 2402.00769 • Published Feb 1 • 18
VideoPrism: A Foundational Visual Encoder for Video Understanding

Paper • 2402.13217 • Published Feb 20 • 18
Video ReCap: Recursive Captioning of Hour-Long Videos

Paper • 2402.13250 • Published Feb 20 • 19
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis

Paper • 2402.14797 • Published Feb 22 • 18
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models

Paper • 2402.17177 • Published Feb 27 • 87
Sora Generates Videos with Stunning Geometrical Consistency

Paper • 2402.17403 • Published Feb 27 • 15
Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners

Paper • 2402.17723 • Published Feb 27 • 16
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

Paper • 2402.19479 • Published Feb 29 • 30
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

Paper • 2403.03100 • Published Mar 5 • 32
Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation

Paper • 2403.02827 • Published Mar 5 • 5
Video Editing via Factorized Diffusion Distillation

Paper • 2403.09334 • Published Mar 14 • 21
Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding

Paper • 2403.09626 • Published Mar 14 • 11
SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations

Paper • 2108.01073 • Published Aug 2, 2021 • 6
AnimateDiff-Lightning: Cross-Model Diffusion Distillation

Paper • 2403.12706 • Published Mar 19 • 17
Streaming Dense Video Captioning

Paper • 2404.01297 • Published Apr 1 • 10
Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization

Paper • 2404.09956 • Published Apr 15 • 10
MotionMaster: Training-free Camera Motion Transfer For Video Generation

Paper • 2404.15789 • Published Apr 24 • 10
LLM-AD: Large Language Model based Audio Description System

Paper • 2405.00983 • Published 27 days ago • 13
FIFO-Diffusion: Generating Infinite Videos from Text without Training

Paper • 2405.11473 • Published 10 days ago • 49
ReVideo: Remake a Video with Motion and Content Control

Paper • 2405.13865 • Published 7 days ago • 19
Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation

Paper • 2405.14598 • Published 6 days ago • 9
Denoising LM: Pushing the Limits of Error Correction Models for Speech Recognition

Paper • 2405.15216 • Published 5 days ago • 9
I2VEdit: First-Frame-Guided Video Editing via Image-to-Video Diffusion Models

Paper • 2405.16537 • Published 3 days ago • 12
Looking Backward: Streaming Video-to-Video Translation with Feature Banks

Paper • 2405.15757 • Published 5 days ago • 9
Human4DiT: Free-view Human Video Generation with 4D Diffusion Transformer

Paper • 2405.17405 • Published 1 day ago • 7
Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control

Paper • 2405.17414 • Published 1 day ago • 5