-
aMUSEd: An Open MUSE Reproduction
Paper • 2401.01808 • Published • 28 -
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
Paper • 2401.01885 • Published • 27 -
SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity
Paper • 2401.00604 • Published • 4 -
LARP: Language-Agent Role Play for Open-World Games
Paper • 2312.17653 • Published • 30
Collections
Discover the best community collections!
Collections including paper arxiv:2311.13073
-
DreaMoving: A Human Dance Video Generation Framework based on Diffusion Models
Paper • 2312.05107 • Published • 38 -
Customizing Motion in Text-to-Video Diffusion Models
Paper • 2312.04966 • Published • 10 -
Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models
Paper • 2312.04410 • Published • 14 -
AnimateZero: Video Diffusion Models are Zero-Shot Image Animators
Paper • 2312.03793 • Published • 17
-
One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
Paper • 2306.07967 • Published • 24 -
Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
Paper • 2306.07954 • Published • 113 -
TryOnDiffusion: A Tale of Two UNets
Paper • 2306.08276 • Published • 72 -
Seeing the World through Your Eyes
Paper • 2306.09348 • Published • 32
-
FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline
Paper • 2311.13073 • Published • 56 -
MetaDreamer: Efficient Text-to-3D Creation With Disentangling Geometry and Texture
Paper • 2311.10123 • Published • 15 -
GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning
Paper • 2311.12631 • Published • 13 -
VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models
Paper • 2312.00845 • Published • 36
-
FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline
Paper • 2311.13073 • Published • 56 -
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Paper • 2403.03206 • Published • 56 -
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation
Paper • 2406.06525 • Published • 64
-
3D-GPT: Procedural 3D Modeling with Large Language Models
Paper • 2310.12945 • Published • 57 -
Habitat 3.0: A Co-Habitat for Humans, Avatars and Robots
Paper • 2310.13724 • Published • 8 -
TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models
Paper • 2310.13772 • Published • 6 -
Reconstructive Latent-Space Neural Radiance Fields for Efficient 3D Scene Representations
Paper • 2310.17880 • Published • 7
-
Reuse and Diffuse: Iterative Denoising for Text-to-Video Generation
Paper • 2309.03549 • Published • 5 -
CCEdit: Creative and Controllable Video Editing via Diffusion Models
Paper • 2309.16496 • Published • 9 -
EvalCrafter: Benchmarking and Evaluating Large Video Generation Models
Paper • 2310.11440 • Published • 15 -
LAMP: Learn A Motion Pattern for Few-Shot-Based Video Generation
Paper • 2310.10769 • Published • 8