Multi-XScience
Source: arXiv. Indexed: 2025-09-30
Download link:
https://github.com/yaolu/multi-xscience
Description:
Multi-XScience is a large-scale multi-document summarization dataset built from scientific papers. The task is to write the "Related Work" section of a paper given that paper's abstract and the articles it cites. The dataset is designed to be as useful as possible for abstractive models, and its reference summaries are highly abstractive, which makes it challenging for summarization models.
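The task setup described above can be sketched as a small data structure. This is only an illustration: the class names, field names, the `@cite_N` marker format, and the separator string are assumptions made for this example, not the dataset's documented schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CitedPaper:
    cite_id: str   # hypothetical citation marker, e.g. "@cite_1"
    abstract: str  # text of the cited article


@dataclass
class Example:
    abstract: str                 # abstract of the target paper
    references: List[CitedPaper]  # the articles the target paper cites
    related_work: str             # target summary: the "Related Work" text


def build_source(example: Example, sep: str = " ||| ") -> str:
    """Join the target abstract and the cited texts into one input string.

    The separator is an arbitrary choice for this sketch.
    """
    parts = [example.abstract]
    for ref in example.references:
        parts.append(f"{ref.cite_id} {ref.abstract}")
    return sep.join(parts)


example = Example(
    abstract="We study X.",
    references=[
        CitedPaper("@cite_1", "Prior work on Y."),
        CitedPaper("@cite_2", "Prior work on Z."),
    ],
    related_work="Y was studied in @cite_1 and Z in @cite_2.",
)
source_text = build_source(example)
```

A summarization model would take `source_text` as input and be trained to produce `related_work`; how the inputs are actually concatenated or truncated depends on the model and is not specified here.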
Provided by:
arXiv.org, Microsoft Academic Graph



