- ChatMusician: Understanding and Generating Music Intrinsically with LLM
  Paper • 2402.16153 • Published • 55
- GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation
  Paper • 2403.14621 • Published • 14
- Garment3DGen: 3D Garment Stylization and Texture Generation
  Paper • 2403.18816 • Published • 19
Collections including paper arxiv:2402.16153
- AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling
  Paper • 2402.12226 • Published • 37
- M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition
  Paper • 2401.11649 • Published • 3
- Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition
  Paper • 2402.15504 • Published • 19
- EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
  Paper • 2402.17485 • Published • 183

- A Novel 1D State Space for Efficient Music Rhythmic Analysis
  Paper • 2111.00704 • Published
- Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
  Paper • 2312.09911 • Published • 52
- Music Style Transfer with Time-Varying Inversion of Diffusion Models
  Paper • 2402.13763 • Published • 9
- ChatMusician: Understanding and Generating Music Intrinsically with LLM
  Paper • 2402.16153 • Published • 55

- MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models
  Paper • 2402.06178 • Published • 12
- DITTO: Diffusion Inference-Time T-Optimization for Music Generation
  Paper • 2401.12179 • Published • 18
- Fast Timing-Conditioned Latent Audio Diffusion
  Paper • 2402.04825 • Published • 7
- Brain2Music: Reconstructing Music from Human Brain Activity
  Paper • 2307.11078 • Published • 39