MuChoMusic dataset

Name: MuChoMusic dataset
Creator: CORA.Repositori de Dades de Recerca
Published: 2025-10-13 14:23:07
License: No description available

Source: DataCite Commons (updated 2025-10-13, indexed 2026-04-25)

Download link:

https://dataverse.csuc.cat/citation?persistentId=doi:10.34810/data2642

Description:

<h2>MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models</h2>

<p>MuChoMusic is a benchmark designed to evaluate music understanding in multimodal language models focused on audio. It includes 1,187 multiple-choice questions validated by human annotators, based on 644 music tracks from two publicly available music datasets. These questions cover a wide variety of genres and assess knowledge and reasoning across several musical concepts and their cultural and functional contexts. The benchmark provides a holistic evaluation of five open-source models, revealing challenges such as over-reliance on the language modality and highlighting the need for better multimodal integration.</p>

<h3>Note on Audio Files</h3>

<p>This dataset comes without audio files. The audio files can be downloaded from two datasets: <a href="https://doi.org/10.5281/zenodo.10072001" target="_new" rel="noreferrer">SongDescriberDataset (SDD)</a> and <a href="https://www.kaggle.com/datasets/googleai/musiccaps" target="_new" rel="noreferrer">MusicCaps</a>. Please see the <a href="https://github.com/mulab-mir/muchomusic" target="_new" rel="noreferrer">code repository</a> for more information on how to download the audio.</p>

<h3>Citation</h3>

<p>If you use this dataset, please cite our <a href="https://arxiv.org/abs/2408.01337" target="_blank" rel="noopener">paper</a>:</p>

<pre><code>@inproceedings{weck2024muchomusic,
  title={MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models},
  author={Weck, Benno and Manco, Ilaria and Benetos, Emmanouil and Quinton, Elio and Fazekas, György and Bogdanov, Dmitry},
  booktitle = {Proceedings of the 25th International Society for Music Information Retrieval Conference (ISMIR)},
  year={2024}
}</code></pre>

<p>Weck B, Manco I, Benetos E, Quinton E, Fazekas G, Bogdanov D. MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models. In: Kaneshiro B, Mysore G, Nieto O, Donahue C, Huang CZA, Lee JH, McFee B, McCallum M, editors. Proceedings of the 25th International Society for Music Information Retrieval Conference (ISMIR2024); 2024 November 10-14; San Francisco, USA.</p>
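
Because the benchmark consists of multiple-choice questions, a model's answers can be summarized as the fraction of questions answered correctly. The following is a minimal sketch of that scoring step, not the authors' evaluation code. The record fields (`id`, `options`, `answer_index`) and the function name are hypothetical and chosen only for illustration; the actual file layout is documented in the code repository.

```python
from typing import Dict, List


def multiple_choice_accuracy(questions: List[Dict], predictions: Dict[str, int]) -> float:
    """Return the fraction of questions whose predicted option index is correct.

    Questions without a prediction are counted as incorrect.
    Field names ("id", "answer_index") are hypothetical, not the dataset's schema.
    """
    if not questions:
        raise ValueError("no questions to score")
    correct = sum(
        1 for q in questions
        if predictions.get(q["id"]) == q["answer_index"]
    )
    return correct / len(questions)


# Toy records for illustration only; these are not actual MuChoMusic entries.
toy_questions = [
    {"id": "q1", "options": ["A", "B", "C", "D"], "answer_index": 0},
    {"id": "q2", "options": ["A", "B", "C", "D"], "answer_index": 2},
    {"id": "q3", "options": ["A", "B", "C", "D"], "answer_index": 1},
    {"id": "q4", "options": ["A", "B", "C", "D"], "answer_index": 3},
]

# Hypothetical model output: option index chosen per question; q4 is unanswered.
toy_predictions = {"q1": 0, "q2": 1, "q3": 1}

accuracy = multiple_choice_accuracy(toy_questions, toy_predictions)
print(f"accuracy = {accuracy:.2f}")
```

Counting unanswered questions as incorrect is a design choice of this sketch. It keeps the denominator fixed at the total number of questions, so scores stay comparable across models that fail to answer different subsets.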

Provider:

CORA.Repositori de Dades de Recerca

Created:

2025-10-07
