
rohitsaxena/MENSA

Source: Hugging Face. Updated 2024-06-16; indexed 2024-06-29.
Download link:
https://hf-mirror.com/datasets/rohitsaxena/MENSA
Overview:
MENSA (Movie Scene Saliency Dataset) was introduced in the paper "Select and Summarize: Scene Saliency for Movie Script Summarization". It consists of movie scripts paired with their summaries, and every scene in a script is annotated with a scene saliency label. The training set contains automatically generated silver labels, while the validation and test sets contain human-annotated gold labels. The dataset is divided into three splits: training, validation, and test.
Provided by:
rohitsaxena
Summary of the original dataset card

MENSA: Movie Scene Saliency Dataset

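Dataset overview

The MENSA (Movie Scene Saliency Dataset) dataset comes from the paper "Select and Summarize: Scene Saliency for Movie Script Summarization" and contains movie scripts with their corresponding summaries. Every scene is annotated with a scene saliency label. The training set contains automatically generated silver labels, while the validation and test sets contain human-annotated gold labels.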

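Dataset structure

The dataset is divided into three splits:

  • Training set: movie scripts and summaries with automatically generated silver scene saliency labels.
  • Validation set: movie scripts and summaries with human-annotated gold scene saliency labels.
  • Test set: movie scripts and summaries with human-annotated gold scene saliency labels.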
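
The split layout above can be sketched in Python as follows. This is a minimal illustration, not the authors' code. The repository ID comes from the download link, but the split names (`train`, `validation`, `test`) follow the usual Hugging Face convention and are an assumption here, as is the label encoding (1 = salient, 0 = not salient). `load_mensa` and `select_salient_scenes` are hypothetical helper names, and the real record fields should be checked against the dataset card before use.

```python
from typing import Dict, List, Sequence

# Label provenance per split, as described in the dataset card.
# The split names themselves are assumed (standard Hugging Face naming).
LABEL_TYPE: Dict[str, str] = {
    "train": "silver",      # automatically generated labels
    "validation": "gold",   # human-annotated labels
    "test": "gold",         # human-annotated labels
}


def load_mensa():
    """Load all splits from the Hugging Face Hub (requires `pip install datasets`)."""
    # Imported lazily so the helper below works without the library installed.
    from datasets import load_dataset
    return load_dataset("rohitsaxena/MENSA")


def select_salient_scenes(scenes: Sequence[str], labels: Sequence[int]) -> List[str]:
    """Keep only the scenes whose saliency label is 1.

    Assumes one label per scene and the encoding 1 = salient, 0 = not salient.
    """
    if len(scenes) != len(labels):
        raise ValueError("scenes and labels must have the same length")
    return [scene for scene, label in zip(scenes, labels) if label == 1]
```

With a loaded record, the helper would be called on that record's scene list and label list, whatever those fields are named in the released data. `LABEL_TYPE` is only a reminder that metrics computed on the training split are measured against silver labels, not gold ones.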

License

Creative Commons Attribution Non Commercial 4.0

Citation

@misc{saxena2024select,
  title={Select and Summarize: Scene Saliency for Movie Script Summarization},
  author={Rohit Saxena and Frank Keller},
  year={2024},
  eprint={2404.03561},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

@inproceedings{saxena-keller-2024-select,
  title = "Select and Summarize: Scene Saliency for Movie Script Summarization",
  author = "Saxena, Rohit and Keller, Frank",
  editor = "Duh, Kevin and Gomez, Helena and Bethard, Steven",
  booktitle = "Findings of the Association for Computational Linguistics: NAACL 2024",
  month = jun,
  year = "2024",
  address = "Mexico City, Mexico",
  publisher = "Association for Computational Linguistics",
  pages = "3439--3455",
}
