rohitsaxena/MENSA

Name: rohitsaxena/MENSA
Creator: rohitsaxena
Published: 2024-06-16 12:13:43
License: Creative Commons Attribution Non Commercial 4.0

Source: Hugging Face. Updated 2024-06-16. Indexed 2024-06-29.

Download link:

https://hf-mirror.com/datasets/rohitsaxena/MENSA


Overview:

MENSA (Movie Scene Saliency Dataset) was introduced in the paper "Select and Summarize: Scene Saliency for Movie Script Summarization". It consists of movie scripts paired with their summaries, and every scene in each script carries a scene saliency label. The dataset has three splits: a training set with automatically generated silver labels, and validation and test sets with human-annotated gold labels.

Provided by:

rohitsaxena

Original Dataset Card Summary

MENSA: Movie Scene Saliency Dataset

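Dataset Overview

MENSA (Movie Scene Saliency Dataset) comes from the paper "Select and Summarize: Scene Saliency for Movie Script Summarization". It contains movie scripts and their corresponding summaries, and each scene is annotated with a scene saliency label. The training set carries automatically generated silver labels, while the validation and test sets carry human-annotated gold labels.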
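As a rough illustration of how per-scene saliency labels might be used, the sketch below reduces a script to its salient scenes. The data layout (a list of scene strings with a parallel list of 0/1 labels) and the function name `select_salient_scenes` are assumptions made for this example, not the dataset's documented schema, and the toy data is invented.

```python
from typing import List, Sequence


def select_salient_scenes(scenes: Sequence[str], labels: Sequence[int]) -> List[str]:
    """Return the scenes whose saliency label is 1, preserving script order.

    Assumes binary labels (1 = salient, 0 = not salient); this layout is
    hypothetical and should be checked against the actual dataset columns.
    """
    if len(scenes) != len(labels):
        raise ValueError("scenes and labels must have the same length")
    return [scene for scene, label in zip(scenes, labels) if label == 1]


# Toy example with invented data (not taken from MENSA).
toy_scenes = [
    "INT. KITCHEN - DAY",
    "EXT. STREET - NIGHT",
    "INT. OFFICE - DAY",
]
toy_labels = [1, 0, 1]
salient = select_salient_scenes(toy_scenes, toy_labels)
```

The selected scenes could then be concatenated and passed to a summarizer, which is the general idea the paper title suggests; the details of the authors' pipeline are not given here.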

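Dataset Structure

The dataset is divided into three parts:

Training set: movie scripts and summaries, with automatically generated silver scene saliency labels.
Validation set: movie scripts and summaries, with human-annotated gold scene saliency labels.
Test set: movie scripts and summaries, with human-annotated gold scene saliency labels.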
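The split layout above can be sketched in code as follows. This is a minimal example that assumes the Hugging Face `datasets` library is installed and that the splits are exposed under the conventional names `train`, `validation`, and `test`; those split names and the helper functions are assumptions, and only the repository id `rohitsaxena/MENSA` comes from this page. Column names are deliberately read from the loaded data rather than guessed.

```python
from typing import Dict, List

# Label provenance per split, as described above.
# The split names themselves are assumed, not confirmed by this page.
LABEL_TYPE_BY_SPLIT = {"train": "silver", "validation": "gold", "test": "gold"}


def label_type_for_split(split: str) -> str:
    """Return 'silver' (automatically generated) or 'gold' (human-annotated)."""
    try:
        return LABEL_TYPE_BY_SPLIT[split]
    except KeyError:
        raise ValueError(f"unknown split: {split!r}") from None


def load_mensa():
    """Download MENSA from the Hugging Face Hub (requires network access)."""
    from datasets import load_dataset  # imported lazily: pip install datasets

    return load_dataset("rohitsaxena/MENSA")


def describe_splits(dataset_dict) -> Dict[str, List[str]]:
    """Map each split name to its column names, without assuming a schema."""
    return {name: list(split.column_names) for name, split in dataset_dict.items()}
```

Calling `describe_splits(load_mensa())` would show which columns each split actually provides, which is a safer starting point than hard-coding field names.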

License

Creative Commons Attribution Non Commercial 4.0

Citation

@misc{saxena2024select,
  title={Select and Summarize: Scene Saliency for Movie Script Summarization},
  author={Rohit Saxena and Frank Keller},
  year={2024},
  eprint={2404.03561},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

@inproceedings{saxena-keller-2024-select,
  title = "Select and Summarize: Scene Saliency for Movie Script Summarization",
  author = "Saxena, Rohit and Keller, Frank",
  editor = "Duh, Kevin and Gomez, Helena and Bethard, Steven",
  booktitle = "Findings of the Association for Computational Linguistics: NAACL 2024",
  month = jun,
  year = "2024",
  address = "Mexico City, Mexico",
  publisher = "Association for Computational Linguistics",
  pages = "3439--3455",
}
