soMLier Vivino Rating Data
Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://data.mendeley.com/datasets/dtbm7n6npz
Description:
This dataset consists of 278,765 ratings made by 92,514 users for 1,640 wines; all ratings (on the interval [1, 5] in increments of 0.5) were scraped from Vivino.com during August 2021. The data can be used, for example, to develop and compare recommender systems based on collaborative filtering or matrix factorisation. The data are already partitioned at random into training, validation and test sets in the proportions 70%, 20% and 10%, respectively. Note that every user appears in the training set, but not all users are present in the validation or test sets.
The data can be loaded in R using load('VivinoRatingData.RData'). The file contains a training ratings matrix 'set.train' with 92,514 user rows and 1,640 wine columns, where the column names are the Vivino wine IDs. 'known.position' lists all matrix indices that correspond to a known rating; some of these entries are NA in 'set.train' because they have been hidden in the validation or test sets. Likewise, 'test.position' and 'valid.position' contain the matrix indices of the ratings hidden in the test and validation sets, respectively. 'set.test' and 'set.valid' contain the rating values hidden from the matrix 'set.train'.
For example, set.train[valid.position[1:3]] = {NA, NA, NA} are the first 3 validation ratings hidden in the training set; their corresponding rating values are set.valid[1:3] = {4, 3.5, 4}.
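The relationship between the objects described above can be sketched on a small toy matrix. This is only an illustration: the values and wine IDs below are made up rather than taken from the dataset, and the position objects are assumed to be vectors of linear (column-major) matrix indices, which is what the indexing example above suggests. The wine-mean predictor is a simple baseline chosen for illustration, not a method prescribed by the dataset.

```r
# Toy stand-in for the real objects (hypothetical values and wine IDs).
full <- matrix(NA_real_, nrow = 4, ncol = 3,
               dimnames = list(NULL, c("w101", "w102", "w103")))

# Assumption: positions are linear (column-major) indices into the matrix.
known.position <- c(1, 2, 4, 6, 7, 8, 11, 12)
full[known.position] <- c(4, 3.5, 5, 2, 4.5, 3, 4, 1.5)

valid.position <- c(2, 7)   # ratings hidden for validation
test.position  <- c(12)     # ratings hidden for testing

# Hidden rating values, stored in the same order as the positions.
set.valid <- full[valid.position]
set.test  <- full[test.position]

# Training matrix: the known ratings with validation/test entries set to NA.
set.train <- full
set.train[c(valid.position, test.position)] <- NA

# Baseline: predict each hidden rating by the wine's mean training rating.
wine.mean <- colMeans(set.train, na.rm = TRUE)
valid.col <- arrayInd(valid.position, dim(set.train))[, 2]
pred <- wine.mean[valid.col]

# Fall back to the global mean for wines with no training ratings.
pred[is.na(pred)] <- mean(set.train, na.rm = TRUE)

# Root mean squared error on the validation ratings.
rmse <- sqrt(mean((pred - set.valid)^2))
print(rmse)
```

With the real file, the same steps would presumably apply after load('VivinoRatingData.RData'), skipping the toy construction and using the loaded 'set.train', 'valid.position' and 'set.valid' directly.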
Created:
2022-09-02



