
IDN-Sum

Indexed by NIAID Data Ecosystem on 2026-05-01
Download link:
https://zenodo.org/record/7083148
Description:
Summarizing Interactive Digital Narratives (IDN) presents unique challenges for existing text summarization models, especially around capturing interactive elements in addition to important plot points. In this paper we describe the first IDN dataset (IDN-Sum) designed specifically for training and testing IDN text summarization algorithms. Our dataset is generated using random playthroughs of 8 IDN episodes, taken from 2 different IDN games, and consists of 10,000 documents. Playthrough documents are annotated through automatic alignment with fan-sourced summaries using a commonly used alignment algorithm. The dataset is released as open source so that future researchers can train and test their own approaches for IDN text. The Annotated Data folder contains the IDN-Sum data that was automatically annotated using the alignment algorithm. Subfolders hold different versions of the data in formats suitable for input to BertSum (bs) from the TransformerSum library (https://github.com/HHousen/TransformerSum) and SummaRuNNer (sr) (implementation at https://github.com/hpzhao/SummaRuNNer). They are named using the convention [model_name]_[summary_length]. Unannotated playthroughs can be found in the Cleaned Data folder.
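The subfolder naming convention [model_name]_[summary_length] can be parsed with a few lines of Python. The following is a minimal sketch, not part of the dataset release: the function name `parse_variant_folder`, the `VariantFolder` type, and the example folder names are hypothetical, and the summary length is kept as an opaque string because the description does not say how it is encoded.

```python
from typing import NamedTuple

# Model codes taken from the dataset description: bs = BertSum, sr = SummaRuNNer.
MODEL_NAMES = {"bs": "BertSum", "sr": "SummaRuNNer"}


class VariantFolder(NamedTuple):
    """Parsed parts of a subfolder name (hypothetical helper type)."""
    model_code: str
    model_name: str
    summary_length: str  # kept as a string; its exact format is an assumption


def parse_variant_folder(folder_name: str) -> VariantFolder:
    """Split a '[model_name]_[summary_length]' folder name into its parts."""
    model_code, sep, summary_length = folder_name.partition("_")
    if not sep or not summary_length:
        raise ValueError(f"expected '[model_name]_[summary_length]', got {folder_name!r}")
    if model_code not in MODEL_NAMES:
        raise ValueError(f"unknown model code {model_code!r} in {folder_name!r}")
    return VariantFolder(model_code, MODEL_NAMES[model_code], summary_length)
```

For instance, a folder hypothetically named `bs_9` would parse to the model code `bs`, the model name `BertSum`, and the summary length `9`. Names that do not follow the convention raise a `ValueError` rather than being silently skipped.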
Created:
2023-07-31