Collections
Collections including paper arxiv:2402.16153
-
CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model
Paper • 2305.06908 • Published • 4
CoMoSVC: Consistency Model-based Singing Voice Conversion
Paper • 2401.01792 • Published • 7
ChatMusician: Understanding and Generating Music Intrinsically with LLM
Paper • 2402.16153 • Published • 55
FlashSpeech: Efficient Zero-Shot Speech Synthesis
Paper • 2404.14700 • Published • 28
-
MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training
Paper • 2306.00107 • Published • 2
MusiLingo: Bridging Music and Text with Pre-trained Language Models for Music Captioning and Query Response
Paper • 2309.08730 • Published • 1
ChatMusician: Understanding and Generating Music Intrinsically with LLM
Paper • 2402.16153 • Published • 55
CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark
Paper • 2401.11944 • Published • 24
-
ChatMusician: Understanding and Generating Music Intrinsically with LLM
Paper • 2402.16153 • Published • 55
GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation
Paper • 2403.14621 • Published • 14
Garment3DGen: 3D Garment Stylization and Texture Generation
Paper • 2403.18816 • Published • 19