
SAS: Semantic Artist Similarity Dataset

Source: Zenodo | Updated: 2020-09-20 | Indexed: 2026-05-25
Download link:
https://zenodo.org/record/1291809
Description:
The Semantic Artist Similarity dataset consists of two datasets of artist entities with their corresponding biography texts, plus a list of the top-10 most similar artists within each dataset, used as ground truth. It comprises a corpus of 268 artists and a larger one of 2,336 artists, both gathered from Last.fm in March 2015. The former is mapped to the MIREX Audio and Music Similarity evaluation dataset, so that its similarity judgments can be used as ground truth. For the latter corpus, the similarity between artists is the one provided by the Last.fm API.

For every artist there is a list of the top-10 most related artists. In the MIREX dataset, 188 artists have at least 10 similar artists; the other 80 artists have fewer than 10. In the Last.fm API dataset, all artists have a list of 10 similar artists.

There are 4 files in the dataset.

mirex_gold_top10.txt and lastfmapi_gold_top10.txt contain the top-10 lists of similar artists for every artist in the two datasets. Artists are identified by MusicBrainz ID (mbid). Each file has one line per artist: the artist mbid, a tab, and then the mbids of the top-10 related artists separated by spaces.

artist_mbid \t artist_mbid_top10_list_separated_by_spaces \n

mb2uri_mirex and mb2uri_lastfmapi.txt contain the list of artists. Each line has three fields separated by tabs: the MusicBrainz ID, the Last.fm name of the artist, and the DBpedia URI.

artist_mbid \t lastfm_name \t dbpedia_uri \n

There are also 2 folders in the dataset with the biography texts, one per dataset. Each .txt file in the biography folders is named after the MusicBrainz ID of the artist it describes. Biographies were gathered from the Last.fm wiki page of every artist.
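The file formats described above are simple enough to read with the Python standard library. The following is a minimal sketch, not code shipped with the dataset: the function names, the UTF-8 encoding, and the biography folder path passed by the caller are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def parse_gold_line(line: str) -> Tuple[str, List[str]]:
    """Split one 'mbid<TAB>mbid mbid ...' line into (artist, similar artists)."""
    artist, _, similar = line.rstrip("\n").partition("\t")
    # MIREX artists may have fewer than 10 entries, so the list length varies.
    return artist, similar.split()


def parse_mapping_line(line: str) -> Tuple[str, str, str]:
    """Split one 'mbid<TAB>lastfm_name<TAB>dbpedia_uri' line into its fields."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 3:
        raise ValueError(f"expected 3 tab-separated fields, got {len(fields)}")
    return fields[0], fields[1], fields[2]


def load_gold(path: str) -> Dict[str, List[str]]:
    """Read a *_gold_top10.txt file into {artist_mbid: [similar_mbid, ...]}."""
    gold: Dict[str, List[str]] = {}
    # Assumption: the files are UTF-8 encoded.
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                artist, similar = parse_gold_line(line)
                gold[artist] = similar
    return gold


def load_biography(folder: str, mbid: str) -> str:
    """Read the biography of one artist; 'folder' is a caller-supplied path."""
    return (Path(folder) / f"{mbid}.txt").read_text(encoding="utf-8")
```

A typical use would be to call `load_gold("lastfmapi_gold_top10.txt")` and then look up the biography of each key with `load_biography`, passing whichever biography folder matches that dataset.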
Using this dataset

We would highly appreciate it if scientific publications of work partly based on the Semantic Artist Similarity dataset cite the following publication:

Oramas, S., Sordo, M., Espinosa-Anke, L., & Serra, X. (In Press). A Semantic-based Approach for Artist Similarity. 16th International Society for Music Information Retrieval Conference.

We are interested in knowing if you find our datasets useful! If you use our dataset, please email us at mtg-info@upf.edu and tell us about your research.

https://www.upf.edu/web/mtg/semantic-similarity
Provider:
Zenodo
Created:
2018-06-29