
mulab-mir/muchomusic

Source: Hugging Face | Updated: 2024-08-05 | Indexed: 2025-04-12
Download link:
https://hf-mirror.com/datasets/mulab-mir/muchomusic
Resource description:
---
license: cc-by-sa-4.0
language:
- en
tags:
- music
- multimodal
pretty_name: MuchoMusic
size_categories:
- 1K<n<10K
---

# MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models

MuChoMusic is a benchmark designed to evaluate music understanding in multimodal language models focused on audio. It includes 1,187 multiple-choice questions validated by human annotators, based on 644 music tracks from two publicly available music datasets. These questions cover a wide variety of genres and assess knowledge and reasoning across several musical concepts and their cultural and functional contexts. The benchmark provides a holistic evaluation of five open-source models, revealing challenges such as over-reliance on the language modality and highlighting the need for better multimodal integration.

## Note on Audio Files

This dataset comes without audio files. The audio files can be downloaded from two datasets: [SongDescriberDataset (SDD)](https://doi.org/10.5281/zenodo.10072001) and [MusicCaps](https://huggingface.co/datasets/google/MusicCaps). Please see the [code repository](https://github.com/mulab-mir/muchomusic) for more information on how to download the audio.

## Citation

If you use this dataset, please cite our [paper](https://arxiv.org/abs/2408.01337):

```
@inproceedings{weck2024muchomusic,
  title={MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models},
  author={Weck, Benno and Manco, Ilaria and Benetos, Emmanouil and Quinton, Elio and Fazekas, György and Bogdanov, Dmitry},
  booktitle = {Proceedings of the 25th International Society for Music Information Retrieval Conference (ISMIR)},
  year={2024}
}
```
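
As a rough illustration of how a multiple-choice benchmark of this kind is typically scored, the sketch below computes accuracy for an answer-selection function. It is not the authors' evaluation code. The `Question` type, its field names (`prompt`, `options`, `answer_index`), the two toy questions, and the `always_first` baseline are all hypothetical and do not reflect the actual dataset schema; see the code repository for the real format.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Question:
    """Hypothetical record layout; the real dataset fields may differ."""
    prompt: str
    options: Sequence[str]
    answer_index: int  # position of the correct option in `options`


def accuracy(questions: Sequence[Question],
             choose: Callable[[Question], int]) -> float:
    """Return the fraction of questions where `choose` picks the correct option."""
    if not questions:
        return 0.0
    correct = sum(1 for q in questions if choose(q) == q.answer_index)
    return correct / len(questions)


def always_first(question: Question) -> int:
    """Trivial baseline that ignores the audio and always picks option 0."""
    return 0


# Toy questions invented for illustration only; they are not from MuChoMusic.
sample = [
    Question("Which instrument carries the melody?",
             ("piano", "violin", "flute", "guitar"), 1),
    Question("What mood does the track convey?",
             ("calm", "aggressive", "playful", "tense"), 0),
]

score = accuracy(sample, always_first)
print(f"Baseline accuracy: {score:.2f}")
```

Comparing a model against an audio-blind baseline like this is one simple way to check whether its answers depend on the audio at all, which relates to the over-reliance on the language modality mentioned above.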
Provided by:
mulab-mir