MovieLens 20M Posters and Subtitles Multi-modal
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/14571725
Description:
This is a multi-modal composite dataset derived from the well-established MovieLens 20M dataset, which provides 20 million movie ratings and tagging activities collected through the MovieLens project. While MovieLens 20M is rich in user-movie interaction data, it contains no modalities beyond those interactions. To address this limitation, we have enhanced the dataset by integrating additional modalities from complementary sources.
Visual data, in the form of movie posters, was obtained from the PosterLens 25M dataset, which associates MovieLens movies with corresponding poster images and precomputed ResNet-34 embeddings.
Textual data was introduced through subtitles sourced from the Sublens-20M dataset, which provides detailed subtitle files for 71% of the movies in MovieLens 20M and covers 98% of user interactions.
Graph data, including comprehensive cast and crew information, was incorporated from The Movies Dataset, originally extracted from The Movie Database (TMDB, https://www.themoviedb.org), to provide detailed contextual insights into each movie.
All modalities are keyed directly to the MovieLens movieId, so they can be joined without an intermediate ID mapping.
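Because every modality shares the movieId key, combining them amounts to a left join per movie. The sketch below illustrates that idea on toy in-memory data. The table contents, the function names `join_on_movie_id` and `coverage`, and the modality labels are all hypothetical and do not reflect the actual file layout of the Zenodo record. It also shows how movie-level coverage and interaction-level coverage can differ, which is the reason subtitles for 71% of movies can still cover 98% of interactions.

```python
from collections import Counter

# Toy stand-ins for the real files; every value here is hypothetical.
ratings = [  # (userId, movieId, rating)
    (1, 10, 4.0), (1, 20, 3.5), (2, 10, 5.0),
    (2, 30, 2.0), (3, 10, 4.5), (3, 20, 4.0),
]
poster_embeddings = {10: [0.1, 0.2], 20: [0.3, 0.4], 30: [0.5, 0.6]}
subtitles = {10: "subtitle text A", 20: "subtitle text B"}  # movie 30 has none
cast = {10: ["actor_x"], 30: ["actor_y"]}                   # movie 20 has none


def join_on_movie_id(movie_ids, **modalities):
    """Left-join each modality onto the given movieIds; missing entries become None."""
    return {
        mid: {name: table.get(mid) for name, table in modalities.items()}
        for mid in movie_ids
    }


def coverage(ratings, table):
    """Return (share of rated movies, share of interactions) that have an entry in table."""
    counts = Counter(mid for _, mid, _ in ratings)
    covered = [mid for mid in counts if mid in table]
    movie_share = len(covered) / len(counts)
    interaction_share = sum(counts[mid] for mid in covered) / len(ratings)
    return movie_share, interaction_share


movie_ids = sorted({mid for _, mid, _ in ratings})
joined = join_on_movie_id(
    movie_ids, poster=poster_embeddings, subtitle=subtitles, cast=cast
)
movie_share, interaction_share = coverage(ratings, subtitles)
```

In this toy data, the one movie without subtitles is also the least-rated one, so the interaction share exceeds the movie share. A left join keeps such movies in the result with `None` for the missing modality, leaving it to downstream code to decide whether to drop or impute them.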
Created:
2024-12-29



