CSPubSum
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/edco95/scientific-paper-summarisation
Description:
This dataset provides the URLs of 10,147 computer science publications selected from ScienceDirect. Each publication covers several sections, including the title, abstract, and author-written research highlights. The data is split into training, validation, and test sets of 8,116, 1,017, and 1,014 samples respectively. The task is automatic generation of research highlights, and the experiments used several input configurations to improve generation quality.
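As a quick sanity check on the split sizes stated in the description, the following minimal Python sketch confirms that the three splits add up to the reported total and shows the share each one takes. The names `SPLITS`, `TOTAL`, and `split_fractions` are illustrative only and are not part of the dataset or its repository.

```python
# Split sizes as stated in the dataset description.
SPLITS = {"train": 8116, "validation": 1017, "test": 1014}

# Total number of publications as stated in the dataset description.
TOTAL = 10147


def split_fractions(splits):
    """Return each split's share of the combined sample count."""
    combined = sum(splits.values())
    return {name: size / combined for name, size in splits.items()}


# The three splits should account for every publication.
assert sum(SPLITS.values()) == TOTAL

for name, fraction in split_fractions(SPLITS).items():
    print(f"{name}: {SPLITS[name]} samples ({fraction:.1%})")
```

The shares come out close to an 80/10/10 division between training, validation, and test data.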
Provided by:
ScienceDirect



