ChatMusician: Understanding and Generating Music Intrinsically with LLM Paper • 2402.16153 • Published Feb 25, 2024 • 60
Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners Paper • 2402.17723 • Published Feb 27, 2024 • 16
ComposerX: Multi-Agent Symbolic Music Composition with LLMs Paper • 2404.18081 • Published Apr 28, 2024 • 2
MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions Paper • 2407.20962 • Published Jul 30, 2024
VidMuse: A Simple Video-to-Music Generation Framework with Long-Short-Term Modeling Paper • 2406.04321 • Published Jun 6, 2024 • 1
YuE: Scaling Open Foundation Models for Long-Form Music Generation Paper • 2503.08638 • Published 20 days ago • 60
AudioX: Diffusion Transformer for Anything-to-Audio Generation Paper • 2503.10522 • Published 18 days ago • 21
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens Paper • 2503.01710 • Published 28 days ago • 5
Chinese Open Instruction Generalist: A Preliminary Release Paper • 2304.07987 • Published Apr 17, 2023 • 2
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI Paper • 2311.16502 • Published Nov 27, 2023 • 35
LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT Paper • 2306.17103 • Published Jun 29, 2023 • 1
CIF-Bench: A Chinese Instruction-Following Benchmark for Evaluating the Generalizability of Large Language Models Paper • 2402.13109 • Published Feb 20, 2024
COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning Paper • 2403.18058 • Published Mar 26, 2024 • 4
The Fine Line: Navigating Large Language Model Pretraining with Down-streaming Capability Analysis Paper • 2404.01204 • Published Apr 1, 2024
Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model Paper • 2404.04167 • Published Apr 5, 2024 • 14
MuPT: A Generative Symbolic Music Pretrained Transformer Paper • 2404.06393 • Published Apr 9, 2024 • 16