- Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance
  Paper • 2401.15687 • Published • 19
- Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians
  Paper • 2312.03029 • Published • 22
- DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation
  Paper • 2312.13578 • Published • 23
- Splatter Image: Ultra-Fast Single-View 3D Reconstruction
  Paper • 2312.13150 • Published • 13

Collections including paper arxiv:2402.17177
- WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens
  Paper • 2401.09985 • Published • 13
- CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects
  Paper • 2401.09962 • Published • 6
- Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution
  Paper • 2401.10404 • Published • 8
- ActAnywhere: Subject-Aware Video Background Generation
  Paper • 2401.10822 • Published • 12

- DocLLM: A layout-aware generative language model for multimodal document understanding
  Paper • 2401.00908 • Published • 174
- Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
  Paper • 2401.04658 • Published • 24
- Weaver: Foundation Models for Creative Writing
  Paper • 2401.17268 • Published • 39
- Efficient Tool Use with Chain-of-Abstraction Reasoning
  Paper • 2401.17464 • Published • 15

- PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models
  Paper • 2312.13964 • Published • 16
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
  Paper • 2312.11514 • Published • 253
- StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation
  Paper • 2312.12491 • Published • 66
- LLaVA-φ: Efficient Multi-Modal Assistant with Small Language Model
  Paper • 2401.02330 • Published • 11

- VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence
  Paper • 2312.02087 • Published • 20
- FaceStudio: Put Your Face Everywhere in Seconds
  Paper • 2312.02663 • Published • 28
- Orthogonal Adaptation for Modular Customization of Diffusion Models
  Paper • 2312.02432 • Published • 12
- ReconFusion: 3D Reconstruction with Diffusion Priors
  Paper • 2312.02981 • Published • 8

- One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
  Paper • 2306.07967 • Published • 23
- Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation
  Paper • 2306.07954 • Published • 111
- TryOnDiffusion: A Tale of Two UNets
  Paper • 2306.08276 • Published • 71
- Seeing the World through Your Eyes
  Paper • 2306.09348 • Published • 30

- Can LLMs Follow Simple Rules?
  Paper • 2311.04235 • Published • 9
- The Unreasonable Ineffectiveness of the Deeper Layers
  Paper • 2403.17887 • Published • 75
- GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
  Paper • 2403.03507 • Published • 176
- Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
  Paper • 2402.17177 • Published • 87

- Attention Is All You Need
  Paper • 1706.03762 • Published • 36
- FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
  Paper • 2307.08691 • Published • 6
- Mixtral of Experts
  Paper • 2401.04088 • Published • 154
- Mistral 7B
  Paper • 2310.06825 • Published • 43

- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 568
- Mixtral of Experts
  Paper • 2401.04088 • Published • 154
- Mistral 7B
  Paper • 2310.06825 • Published • 43
- Don't Make Your LLM an Evaluation Benchmark Cheater
  Paper • 2311.01964 • Published • 1