E-learning Recommender System Dataset

DataCite Commons · Updated 2025-05-12 · Indexed 2025-05-17
Download link:
https://dataverse.harvard.edu/citation?persistentId=doi:10.7910/DVN/BMY3UD
Description:
<p>The Mandarine Academy Recommender System (MARS) Dataset was collected from a real-world open MOOC (https://mooc.office365-training.com/). The dataset offers both explicit and implicit ratings for the French and English versions of the MOOC. Compared with classical recommendation datasets such as MovieLens, it is rather small because of the nature of the available content (educational). However, the dataset offers insights into real-world ratings and provides a testing ground away from common datasets.</p>
<p>All items are available online for viewing in both the French and English versions. All selected users have rated at least one item. No demographic information is included. Each user is represented by an id and a job (if available).</p>
<p>For both French and English, the same files are available in .csv format. We provide the following files:</p>
<ul> <li><strong>Users</strong>: contains user ids and their jobs.</li> <li><strong>Items</strong>: contains information about items (resources) in the selected language, with a mix of feature types.</li> <li><strong>Ratings</strong>: contains both <span style="text-decoration: underline;">explicit</span> ratings (watch time) and <span style="text-decoration: underline;">implicit</span> ratings (page views of items).</li> </ul>
<h3><strong>Formatting and Encoding</strong></h3>
<p>The dataset files are written as <a href="http://en.wikipedia.org/wiki/Comma-separated_values">comma-separated values</a> files with a single header row. Columns that contain commas (,) are escaped using double quotes ("). These files are encoded as UTF-8.</p>
<h3><strong>User Ids</strong></h3>
<p>User ids are consistent between explicit_ratings.csv, implicit_ratings.csv, and users.csv (i.e., the same id refers to the same user across the dataset).</p>
<h3><strong>Item Ids</strong></h3>
<p>Item ids are consistent between explicit_ratings.csv, implicit_ratings.csv, and items.csv (i.e., the same id refers to the same item across the dataset).</p>
<h3><strong>Ratings Data File Structure</strong></h3>
<p>All ratings are contained in the files explicit_ratings.csv and implicit_ratings.csv. Each line of these files after the header row represents one rating of one item by one user, and has the following format:</p>
<ul> <li>item_id,user_id,created_at (implicit_ratings.csv)</li> <li>user_id,item_id,watch_percentage,created_at,rating (explicit_ratings.csv)</li> </ul>
<h3><strong>Item Data File Structure</strong></h3>
<p>Item information is contained in the file items.csv. Each line of this file after the header row represents one item, and has the following format:</p>
<ul> <li>item_id,language,name,nb_views,description,created_at,Difficulty,Job,Software,Theme,duration,type</li> </ul>
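<p>As a minimal sketch of how the file layouts described above might be read, the following Python snippet uses only the standard csv module. The helper names read_rows and unknown_ids, the sample rows, and the shortened two-column items sample are illustrative assumptions and are not part of the dataset.</p>

```python
import csv
import io

# Column layouts as stated in the dataset description.
IMPLICIT_COLUMNS = ["item_id", "user_id", "created_at"]
EXPLICIT_COLUMNS = ["user_id", "item_id", "watch_percentage", "created_at", "rating"]


def read_rows(handle, expected_columns):
    """Read a CSV stream with a single header row and check its column names."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != expected_columns:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    return list(reader)


def unknown_ids(rating_rows, reference_rows, key):
    """Return ids that appear in rating_rows but not in reference_rows."""
    known = {row[key] for row in reference_rows}
    return {row[key] for row in rating_rows} - known


# Made-up sample rows for illustration only; these values are NOT from the dataset.
SAMPLE_EXPLICIT = (
    "user_id,item_id,watch_percentage,created_at,rating\n"
    "u1,i1,80,2021-01-01,4\n"
    "u2,i9,10,2021-01-02,1\n"
)
# Hypothetical, shortened items sample; the real items.csv has more columns.
SAMPLE_ITEMS = 'item_id,name\ni1,"Excel, first steps"\n'

ratings = read_rows(io.StringIO(SAMPLE_EXPLICIT), EXPLICIT_COLUMNS)
items = list(csv.DictReader(io.StringIO(SAMPLE_ITEMS)))

# The quoted field keeps its comma instead of being split into two columns.
print(items[0]["name"])

# If ids are consistent across files, this set is empty; here "i9" is unknown.
print(unknown_ids(ratings, items, "item_id"))
```

<p>With the downloaded files, the in-memory strings would be replaced by file handles opened with open(path, newline="", encoding="utf-8"), which matches the stated UTF-8 encoding. All values are read as strings, so numeric columns such as watch_percentage would still need to be converted before use.</p>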
Provider:
Harvard Dataverse
Created:
2022-09-23