Collections
Collections including paper arxiv:2402.17485
- MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs
  Paper • 2402.15627 • Published • 32
- Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts
  Paper • 2402.16822 • Published • 15
- FuseChat: Knowledge Fusion of Chat Models
  Paper • 2402.16107 • Published • 36
- Multi-LoRA Composition for Image Generation
  Paper • 2402.16843 • Published • 28

- VideoPrism: A Foundational Visual Encoder for Video Understanding
  Paper • 2402.13217 • Published • 19
- EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
  Paper • 2402.17485 • Published • 184
- Qwen/Qwen-VL-Chat
  Text Generation • Updated • 2.33M • 292
- MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies
  Paper • 2403.01422 • Published • 24

- AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling
  Paper • 2402.12226 • Published • 37
- M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition
  Paper • 2401.11649 • Published • 3
- Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition
  Paper • 2402.15504 • Published • 20
- EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
  Paper • 2402.17485 • Published • 184

- metavoiceio/metavoice-1B-v0.1
  Text-to-Speech • Updated • 1.2k • 728
- BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data
  Paper • 2402.08093 • Published • 52
- EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
  Paper • 2402.17485 • Published • 184

- Advances in 3D Generation: A Survey
  Paper • 2401.17807 • Published • 16
- IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation
  Paper • 2402.08682 • Published • 12
- LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation
  Paper • 2402.05054 • Published • 24
- GaussianObject: Just Taking Four Images to Get A High-Quality 3D Object with Gaussian Splatting
  Paper • 2402.10259 • Published • 13

- Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling
  Paper • 2401.15977 • Published • 35
- Lumiere: A Space-Time Diffusion Model for Video Generation
  Paper • 2401.12945 • Published • 85
- AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
  Paper • 2307.04725 • Published • 63
- Boximator: Generating Rich and Controllable Motions for Video Synthesis
  Paper • 2402.01566 • Published • 26

- Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance
  Paper • 2401.15687 • Published • 20
- Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians
  Paper • 2312.03029 • Published • 22
- DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation
  Paper • 2312.13578 • Published • 23
- Splatter Image: Ultra-Fast Single-View 3D Reconstruction
  Paper • 2312.13150 • Published • 13

- Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis
  Paper • 2401.09048 • Published • 7
- Improving fine-grained understanding in image-text pre-training
  Paper • 2401.09865 • Published • 13
- Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data
  Paper • 2401.10891 • Published • 54
- Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild
  Paper • 2401.13627 • Published • 70