Sound and music recommendation with knowledge graphs [dataset]

DataCite Commons, updated 2025-06-10, indexed 2025-04-09
Download link:
https://dataverse.csuc.cat/citation?persistentId=doi:10.34810/data444
Resource description:
Music Recommendation Dataset (KGRec-music)
Number of items: 8,640. Number of users: 5,199. Number of item-user interactions: 751,531.
All the data comes from the songfacts.com and last.fm websites. Items are songs, each described by a textual description extracted from songfacts.com and by tags from last.fm.
Files and folders in the dataset:
/descriptions: One file per item, containing the textual description of the item. The file name is the id of the item plus the ".txt" extension.
/tags: One file per item, containing the tags of the item separated by spaces. Multiword tags are joined with "-". The file name is the id of the item plus the ".txt" extension. Not all items have tags; 401 items have none.
implicit_lf_dataset.txt: The interactions between users and items. There is one line per interaction (a user that downloaded a sound in this case). Fields in a line are separated by tabs, in the format: user_id \t sound_id \t 1 \n

Sound Recommendation Dataset (KGRec-sound)
Number of items: 21,552. Number of users: 20,000. Number of item-user interactions: 2,117,698.
All the data comes from Freesound.org. Items are sounds, each described by a textual description and tags created by the sound's creator at upload time.
Files and folders in the dataset:
/descriptions: One file per item, containing the textual description of the item. The file name is the id of the item plus the ".txt" extension.
/tags: One file per item, containing the tags of the item separated by spaces. The file name is the id of the item plus the ".txt" extension.
downloads_fs_dataset.txt: The interactions between users and items. There is one line per interaction (a user that downloaded a sound in this case). Fields in a line are separated by tabs, in the format: user_id \t sound_id \t 1 \n
Two datasets are provided, one for Music Recommendation (KGRec-music) and the other for Sound Recommendation (KGRec-sound). Each contains users, items, implicit feedback interactions between users and items, item tags, and item text descriptions.
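
The file formats described above can be read with a few lines of Python. The following is a minimal sketch, not an official loader: the function names parse_interactions and parse_tags are hypothetical, and it assumes each interaction line holds exactly three tab-separated fields (user id, item id, and the constant 1) and that tag files are whitespace-separated, as the description states.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def parse_interactions(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each user id to the set of item ids that user interacted with.

    Assumes the documented layout: user_id <TAB> item_id <TAB> 1.
    """
    user_items: Dict[str, Set[str]] = defaultdict(set)
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(f"expected 3 tab-separated fields, got: {raw!r}")
        user_id, item_id, _flag = fields  # the third field is always 1
        user_items[user_id].add(item_id)
    return dict(user_items)


def parse_tags(text: str) -> List[str]:
    """Split the content of one tag file on whitespace.

    Multiword tags stay intact because they are joined with "-".
    An empty file (an item without tags) yields an empty list.
    """
    return text.split()
```

With the dataset unpacked locally, a call such as parse_interactions(open("implicit_lf_dataset.txt", encoding="utf-8")) would build the user-to-items mapping; the UTF-8 encoding is an assumption, since the description does not state one.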

Provider:
CORA.Repositori de Dades de Recerca
Created:
2022-10-11
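Collected summary
Background and challenges
Background overview
This dataset contains two recommendation subsets, one for music and one for sound. Each provides user-item interaction data, item text descriptions, and item tags, and is suited to research on knowledge-graph-based recommender systems.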
The summary above was collected and generated by 遇见数据集.