AI & ML interests

MIR, Music AI

Organization Card


Data is crucial in various computer-related fields, including Music Information Retrieval (MIR), an interdisciplinary area bridging computer science and music. This paper introduces CCMusic, an open and diverse database comprising numerous datasets for different MIR tasks, all of which are publicly available. Notably, CCMusic includes several datasets tailored for Chinese music, an underrepresented domain in the MIR community. Our database integrates both published and unpublished (unevaluated) datasets; for the former, our contribution is the integration itself, while for the latter, we provide comprehensive evaluations to ensure data reliability. The raw data of this database comes from ccmusic-database.


  author       = {Monan Zhou and Shenyang Xu and Zhaorui Liu and Zhaowen Wang and Feng Yu and Wei Li and Baoqiang Han},
  title        = {CCMusic: an Open and Diverse Database for Chinese and General Music Information Retrieval Research},
  month        = {mar},
  year         = {2024},
  publisher    = {HuggingFace},
  version      = {1.2},
  url          = {}


We thank Yuan Wang for contributing computational resources to part of the evaluation, and Shaohua Ji for contributing recordings that expanded the Piano Sound Quality Dataset. We also thank Dichucheng Li and Yulun Wu for their effective communication in aligning the details of the guzheng-related datasets.