MusicBrainz
Source: arXiv. Indexed 2025-09-30.
Download link:
https://dbs.uni-leipzig.de/research/projects/benchmark-datasets-for-entity-resolution
Resource description:
This dataset is built from song records in the MusicBrainz database. A data generator creates copies of those records with modified attribute values, and the result serves as a benchmark for entity matching. Variants of every size are available via the provided URL. The variant used here contains 20,000 records, split into training, validation, and test sets at a 60/20/20 ratio.
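As a rough illustration of the 60/20/20 split, the sketch below shuffles 20,000 placeholder record IDs and cuts them into three parts. The function name `split_60_20_20`, the seed, and the choice to split at the record level are assumptions made for this example; the description does not say whether the original split was applied to records or to record pairs, nor which random seed was used.

```python
import random


def split_60_20_20(records, seed=0):
    """Shuffle a copy of `records` and cut it into train/validation/test parts.

    Hypothetical helper: the seed and record-level splitting are assumptions,
    not details taken from the dataset description.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # deterministic shuffle for a given seed

    n_train = len(shuffled) * 60 // 100  # 60% for training
    n_val = len(shuffled) * 20 // 100    # 20% for validation

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]    # remaining 20% for testing
    return train, val, test


# Placeholder IDs standing in for the 20,000 records of the variant used.
record_ids = range(20_000)
train, val, test = split_60_20_20(record_ids)
print(len(train), len(val), len(test))  # 12000 4000 4000
```

With 20,000 records this yields 12,000 training, 4,000 validation, and 4,000 test items. For entity matching, it may be preferable to keep all copies of the same original song in one partition so that duplicates do not leak across splits; that grouping step is not shown here.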



