E-learning Recommender System Dataset

Name: E-learning Recommender System Dataset
Creator: Harvard Dataverse
Published: 2025-05-12 00:03:08
License: No description available

DataCite Commons | Updated 2025-05-12 | Indexed 2025-05-17

Download link:

https://dataverse.harvard.edu/citation?persistentId=doi:10.7910/DVN/BMY3UD

Resource description:

<p>The Mandarine Academy Recommender System (MARS) dataset is captured from a real-world open MOOC (https://mooc.office365-training.com/). It offers both explicit and implicit ratings for the French and English versions of the MOOC. Compared with classical recommendation datasets such as MovieLens, it is rather small because of the educational nature of the available content. However, it offers insight into real-world ratings and provides a testing ground away from the common datasets.</p>
<p>All items are available online for viewing in both the French and English versions. All selected users have rated at least one item. No demographic information is included. Each user is represented by an id and a job (if available).</p>
<p>For both French and English, the same set of files is available in .csv format. We provide the following files:</p>
<ul>
<li><strong>Users</strong>: contains user ids and their jobs.</li>
<li><strong>Items</strong>: contains information about items (resources) in the selected language, with a mix of feature types.</li>
<li><strong>Ratings</strong>: both <span style="text-decoration: underline;">explicit</span> (watch time) and <span style="text-decoration: underline;">implicit</span> (page views of items).</li>
</ul>
<h3><strong>Formatting and Encoding</strong></h3>
<p>The dataset files are written as <a href="http://en.wikipedia.org/wiki/Comma-separated_values">comma-separated values</a> files with a single header row. Fields that contain commas (,) are enclosed in double quotes ("). These files are encoded as UTF-8.</p>
<h3><strong>User Ids</strong></h3>
<p>User ids are consistent across explicit_ratings.csv, implicit_ratings.csv, and users.csv (i.e., the same id refers to the same user across the dataset).</p>
<h3><strong>Item Ids</strong></h3>
<p>Item ids are consistent across explicit_ratings.csv, implicit_ratings.csv, and items.csv (i.e., the same id refers to the same item across the dataset).</p>
<h3><strong>Ratings Data File Structure</strong></h3>
<p>All ratings are contained in the files explicit_ratings.csv and implicit_ratings.csv. Each line of these files after the header row represents one rating of one item by one user, and has the following format:</p>
<ul>
<li>item_id,user_id,created_at (implicit_ratings.csv)</li>
<li>user_id,item_id,watch_percentage,created_at,rating (explicit_ratings.csv)</li>
</ul>
<h3><strong>Item Data File Structure</strong></h3>
<p>Item information is contained in the file items.csv. Each line of this file after the header row represents one item, and has the following format:</p>
<ul>
<li>item_id,language,name,nb_views,description,created_at,Difficulty,Job,Software,Theme,duration,type</li>
</ul>
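The file layout described above (UTF-8, one header row, double-quoted fields, ids shared across files) can be read with Python's standard csv module. The following is a minimal sketch, not an official loader: the helper names (read_rows, load_csv, unknown_ids) are hypothetical, the sample rows are synthetic placeholders rather than real dataset records, and it assumes the header row uses exactly the column names listed in the description.

```python
import csv
import io

# Column layouts as listed in the dataset description (assumed to match the headers).
EXPLICIT_COLUMNS = ["user_id", "item_id", "watch_percentage", "created_at", "rating"]
IMPLICIT_COLUMNS = ["item_id", "user_id", "created_at"]


def read_rows(stream, expected_columns):
    """Parse a comma-separated stream with one header row into a list of dicts."""
    reader = csv.DictReader(stream)  # the default dialect handles double-quoted fields
    missing = set(expected_columns) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return list(reader)


def load_csv(path, expected_columns):
    """Open a dataset file as UTF-8 and parse it (path is supplied by the caller)."""
    with open(path, encoding="utf-8", newline="") as handle:
        return read_rows(handle, expected_columns)


def unknown_ids(ratings, known_ids, key):
    """Return the ids found in ratings under `key` that are absent from known_ids."""
    return {row[key] for row in ratings} - set(known_ids)


# Synthetic placeholder rows (NOT taken from the dataset), for illustration only.
SAMPLE_EXPLICIT = (
    "user_id,item_id,watch_percentage,created_at,rating\n"
    "u1,i1,80,2021-01-01,4\n"
    "u2,i9,10,2021-01-02,1\n"
)
sample_rows = read_rows(io.StringIO(SAMPLE_EXPLICIT), EXPLICIT_COLUMNS)

# Item ids should be consistent across files; here "i9" is deliberately unknown.
missing_items = unknown_ids(sample_rows, {"i1", "i2"}, "item_id")
```

On the real files, a call such as load_csv("explicit_ratings.csv", EXPLICIT_COLUMNS) followed by unknown_ids(...) against the ids read from items.csv should return an empty set if the stated id consistency holds. All values are kept as strings here; converting watch_percentage or rating to numbers is left to the caller, since the description does not state their exact types.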

Provider:

Harvard Dataverse

Created:

2022-09-23
