
Video Recommendations Based on Visual Features Extracted with Deep Learning

Indexed from NIAID Data Ecosystem on 2026-03-12
Download link:
https://zenodo.org/record/4889728
Description:
The dataset contains visual features extracted from 12875 movie trailers. The features are extracted from key-frames of the trailers with the VGG-19 CNN, pre-trained on ImageNet. Movies in the dataset are identified by their MovieLens movieId.

Features_sparse.zip contains the 4096-dimensional feature vectors of each key-frame from every movie.
Visual labels.zip contains the 1000-dimensional label feature vectors of each key-frame from every movie.
DeepCineProp-f.p combines the label features of each movie into a vector space model using tf-idf.
CineSub.p contains the subtitles of each movie, pre-processed with various NLP techniques and represented in a vector space model produced using tf-idf.

Abstract: When a movie is uploaded to a movie Recommender System (e.g., YouTube), the system can exploit various forms of descriptive features (e.g., tags and genre) to generate personalized recommendations for users. However, there are situations where the descriptive features are missing or very limited, and the system may fail to include such a movie in the recommendation list. This is known as the cold-start problem. This thesis investigates recommendation based on a novel form of content features extracted from the movies themselves. These features represent the visual aspects of movies, are based on Deep Learning models, and hence do not require any human annotation when extracted. The proposed technique has been evaluated both offline and online using a large dataset of movies. The online evaluation was carried out in an evaluation framework developed for this thesis. Results from the offline and online evaluations (N=150) show that automatically extracted visual features can mitigate the cold-start problem by generating recommendations of superior quality compared to different baselines, including recommendations based on human-annotated features. The results also point to subtitles as a high-quality future source of automatically extracted features.
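
The description says that per-key-frame label vectors are combined into a tf-idf vector space model per movie, but it does not give the exact procedure. The following is a minimal sketch of one plausible way to do this, not the authors' implementation. It assumes that a label counts as a "term occurrence" when its score in a key-frame reaches a threshold, and that movies are then compared by cosine similarity. The function names, the threshold of 0.5, the smoothed idf formula, and the three-label toy data are all hypothetical and chosen for illustration only.

```python
import math

def movie_term_counts(keyframe_labels, threshold=0.5):
    """Count, per label, how many key-frames score at or above the threshold.

    Assumption: thresholded label scores stand in for term occurrences.
    """
    counts = [0] * len(keyframe_labels[0])
    for frame in keyframe_labels:
        for j, score in enumerate(frame):
            if score >= threshold:
                counts[j] += 1
    return counts

def tfidf(counts_by_movie):
    """Turn {movieId: label counts} into {movieId: tf-idf vector}."""
    n_movies = len(counts_by_movie)
    n_labels = len(next(iter(counts_by_movie.values())))
    # Document frequency: number of movies in which each label occurs.
    df = [0] * n_labels
    for counts in counts_by_movie.values():
        for j, c in enumerate(counts):
            if c > 0:
                df[j] += 1
    # Smoothed idf (an assumption; other idf variants exist).
    idf = [math.log((1 + n_movies) / (1 + d)) + 1.0 for d in df]
    return {m: [c * w for c, w in zip(counts, idf)]
            for m, counts in counts_by_movie.items()}

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def most_similar(movie_id, vectors, k=2):
    """Rank the other movies by cosine similarity to movie_id."""
    scores = [(other, cosine(vectors[movie_id], vec))
              for other, vec in vectors.items() if other != movie_id]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]

# Toy data: 3 hypothetical labels instead of 1000, keyed by a made-up movieId.
keyframes = {
    1: [[0.9, 0.1, 0.0], [0.8, 0.0, 0.1]],
    2: [[0.7, 0.1, 0.6]],
    3: [[0.0, 0.9, 0.1], [0.1, 0.8, 0.0]],
}
counts_by_movie = {m: movie_term_counts(f) for m, f in keyframes.items()}
vectors = tfidf(counts_by_movie)
print(most_similar(1, vectors, k=1))
```

In this toy example, movie 2 shares a label with movie 1 and is ranked first, while movie 3 shares none and gets a similarity of zero. A new movie with no ratings or tags can be ranked this way from its trailer alone, which is the cold-start setting the abstract targets.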
Created:
2021-06-02