Head-100, Median-100, Tail-100, Mix-100 samples of Amazon Reviews 5-core (2014)
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://doi.org/10.7910/DVN/V7X3VE
Description:
Subsamples of the Amazon Reviews 5-core dataset (2014) (see https://jmcauley.ucsd.edu/data/amazon). Each subsample consists of entries (for instance, a rating and/or a review of an Amazon product) from 100 users. The Head-100 data set holds entries by the 100 head users, that is, the users with the highest number of entries. The Median-100 data set holds entries from 100 median users, that is, users whose number of entries is around the median. The Tail-100 data set holds entries from 100 tail users, that is, the users with the fewest entries. Finally, the Mix-100 data set holds entries by a sample of the Head-100, Median-100, and Tail-100 data sets at a ratio of 33:34:33. The data is organized into four folders holding the four review text samples, four .csv files holding the respective user-item matrices, and a Jupyter notebook that can be used to reproduce the sampling. The data sets are not pre-split into training and test sets; use the Jupyter notebook to sample training and test sets.
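The head, median, tail, and mix selection described above can be illustrated with a minimal sketch. This is not the code from the dataset's Jupyter notebook. The function names `select_user_groups` and `mix_users`, the `(user, item, rating)` tuple layout, the tie-breaking rule, and the reading of "around the median" as a window centered on the middle of the ranking are all assumptions made for illustration. The sketch also assumes that the 33:34:33 ratio refers to users drawn from each group.

```python
import random
from collections import Counter


def select_user_groups(entries, k=100):
    """Rank users by entry count and return (head, median, tail) lists of k users.

    `entries` is an iterable of (user, item, rating) tuples (assumed layout).
    """
    counts = Counter(user for user, _item, _rating in entries)
    # Sort by descending entry count; ties are broken by user id so the
    # result is deterministic (an assumption, not taken from the source).
    ranked = sorted(counts, key=lambda u: (-counts[u], u))
    if len(ranked) < k:
        raise ValueError("fewer distinct users than k")
    head = ranked[:k]                   # users with the most entries
    tail = ranked[-k:]                  # users with the fewest entries
    start = (len(ranked) - k) // 2      # window centered on the median rank
    median = ranked[start:start + k]
    return head, median, tail


def mix_users(head, median, tail, sizes=(33, 34, 33), seed=0):
    """Draw users from the three groups without replacement, at the given sizes."""
    rng = random.Random(seed)
    mixed = []
    for group, n in zip((head, median, tail), sizes):
        mixed.extend(rng.sample(group, n))
    return mixed


# Toy data: user "u0" has 1 entry, "u1" has 2 entries, ..., "u8" has 9 entries.
toy_entries = [(f"u{i}", f"item{j}", 5) for i in range(9) for j in range(i + 1)]
toy_head, toy_median, toy_tail = select_user_groups(toy_entries, k=3)
toy_mix = mix_users(toy_head, toy_median, toy_tail, sizes=(1, 1, 1))
```

With a real copy of the data, `k=100` and the default `sizes=(33, 34, 33)` would correspond to the group sizes named in the description. When the population is small the three groups can overlap, so a production version would need to decide how to handle duplicates.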
Created:
2021-10-01



