Datasets for NLP tasks on scholarly articles

Name: Datasets for NLP tasks on scholarly articles
Creator: Open Research Knowledge Graph
Published: 2022-12-23 15:31:50
License: No description available

DataCite Commons | Updated: 2022-12-23 | Indexed: 2025-04-16

Download link:

https://orkg.org/comparison/R280270/

Description:

This comparison surveys existing datasets for NLP tasks on scholarly articles, mainly information extraction (IE), named entity recognition (NER), and summarization. For each dataset it records the annotation level, generation method, data sources, the concepts and relations that were annotated, and inter-annotator agreement scores. It also lists the baselines and models that were used, together with their results.

Provider:

Open Research Knowledge Graph

Created:

2022-12-23
