krishnakamath/movielens-32m-sequential-recommender

Name: krishnakamath/movielens-32m-sequential-recommender
Creator: krishnakamath
Published: 2025-11-29 00:39:38
License: No description available

Source: Hugging Face. Updated 2025-11-29. Indexed 2025-12-20.

Download link:

https://hf-mirror.com/datasets/krishnakamath/movielens-32m-sequential-recommender

Description:

---
dataset_info:
  features:
  - name: input_sequence
    dtype: string
  - name: target_item
    dtype: string
  splits:
  - name: train
    num_bytes: 267855410
    num_examples: 250000
  - name: validation
    num_bytes: 158391666
    num_examples: 50000
  - name: test
    num_bytes: 159385395
    num_examples: 50000
  download_size: 240900536
  dataset_size: 585632471
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: validation
    path: data/validation-*
  - split: test
    path: data/test-*
pretty_name: MovieLens 32M Sequential Recommender
size_categories:
- n<1M
source_datasets:
- movielens
---

# MovieLens 32M Sequential Recommender Dataset

This dataset is a processed version of the [MovieLens 32M dataset](https://grouplens.org/datasets/movielens/32m/), specifically formatted for sequential recommendation tasks. It contains user-item interaction sequences, enriched with rating and timestamp information, split into training, validation, and test sets.

## Dataset Structure

The dataset is provided as a `DatasetDict` with three splits: `train`, `validation`, and `test`. Each split contains:

- `input_sequence`: A string representing a user's interaction history. Each interaction is formatted as `movieId:rating:timestamp`. Sequences vary in length and starting points to provide diverse training examples.
- `target_item`: A string representing the `movieId` of the next item the user interacted with, which the model is expected to predict.

## Generation Parameters

This dataset was generated with the following parameters:

- `NUM_USERS`: 50000 (Number of unique users included in the dataset)
- `MAX_SEQUENCES_PER_USER`: 5 (Maximum number of training sequences sampled from each user's history)

These parameters are also embedded in the dataset's metadata for reproducibility.

## Citation

Please cite the original MovieLens dataset if you use this data in your research:

F. Maxwell Harper and Joseph A. Konstan. 2015. The MovieLens Datasets: History and Context. ACM Transactions on Interactive Intelligent Systems (TiiS) 5, 4: 19:1–19:19. https://doi.org/10.1145/2827872

## Acknowledgement

The Python scripts used to generate and process this dataset were developed with the assistance of Google's Gemini.
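
## Example: Parsing `input_sequence`

The card describes each interaction in `input_sequence` as `movieId:rating:timestamp`, but it does not say how consecutive interactions are separated. The sketch below is a minimal parser under the assumption that they are whitespace-separated; the `sep` parameter lets you swap in another delimiter if the actual data differs. The names `Interaction` and `parse_input_sequence` are hypothetical helpers introduced here for illustration, and the sample values in the usage note are made up, not taken from the dataset.

```python
from typing import List, NamedTuple, Optional


class Interaction(NamedTuple):
    """One `movieId:rating:timestamp` triple from `input_sequence`."""
    movie_id: str
    rating: float
    timestamp: int


def parse_input_sequence(sequence: str, sep: Optional[str] = None) -> List[Interaction]:
    """Split an `input_sequence` string into Interaction tuples.

    ASSUMPTION: interactions are separated by whitespace (sep=None).
    The dataset card does not document the separator, so pass e.g.
    sep="," if the real data uses a different one.
    """
    interactions = []
    for token in sequence.split(sep):
        token = token.strip()
        if not token:
            # Skip empty pieces produced by leading/trailing separators.
            continue
        # Raises ValueError if a token does not have exactly three fields.
        movie_id, rating, timestamp = token.split(":")
        interactions.append(
            # int(float(...)) tolerates timestamps written as "123.0".
            Interaction(movie_id, float(rating), int(float(timestamp)))
        )
    return interactions
```

For instance, `parse_input_sequence("1:4.0:964982703 3:4.5:964981247")` yields two `Interaction` tuples whose `movie_id` values are `"1"` and `"3"`. Rows to feed into the parser can be obtained with the Hugging Face `datasets` library, for example `load_dataset("krishnakamath/movielens-32m-sequential-recommender", split="train")`, reading the `input_sequence` and `target_item` fields of each row.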

Provider:

krishnakamath
