CMU Movie Summary Corpus

Name: CMU Movie Summary Corpus
Creator: OpenDataLab
Published: 2026-05-17 06:30:07
License: No description available

OpenDataLab, updated 2026-05-17, indexed 2024-05-09

Download link:

https://opendatalab.org.cn/OpenDataLab/CMU_Movie_Summary_Corpus


Resource description:

The dataset [46 MB] and its README cover 42,306 movie plot summaries extracted from Wikipedia, together with aligned metadata extracted from Freebase. The metadata includes movie box office revenue, genre, release date, runtime, and language, as well as character names and aligned information about the actors who portray them, including each actor's gender and estimated age at the time of the movie's release.

Supplement: Stanford CoreNLP-processed summaries [628 MB]. All of the plot summaries above have been run through the Stanford CoreNLP pipeline, including tokenization, syntactic parsing, named entity recognition (NER), and coreference resolution.
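As a rough illustration of working with the plot summaries, the sketch below reads them into a dictionary keyed by movie ID. It assumes a tab-separated text layout with one movie per line (an ID, a tab, then the summary); that layout, the function name `load_plot_summaries`, and the sample rows are assumptions made for illustration and are not stated on this page, so check the README shipped with the dataset before relying on them.

```python
import io
from typing import Dict, TextIO


def load_plot_summaries(handle: TextIO) -> Dict[str, str]:
    """Parse lines of the ASSUMED form: <movie_id>\t<summary text>."""
    summaries: Dict[str, str] = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Split on the first tab only, so tabs inside a summary are kept.
        parts = line.split("\t", 1)
        if len(parts) != 2:
            continue  # skip lines that do not match the assumed layout
        movie_id, summary = parts
        summaries[movie_id] = summary.strip()
    return summaries


# Made-up placeholder rows for demonstration; not taken from the corpus.
sample = io.StringIO(
    "1001\tPlaceholder summary for a first film.\n"
    "\n"
    "1002\tPlaceholder summary for a second film.\n"
)
summaries = load_plot_summaries(sample)
print(len(summaries))
```

With a real file, the same function could be given an open file handle in place of the in-memory sample, for example `open(path, encoding="utf-8")`, where `path` points at whichever summaries file the README describes.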

Provider:

OpenDataLab

Created:

2022-05-23

Collection Summary

Dataset Introduction