Datasets for NLP tasks on scholarly articles

Source: DataCite Commons | Updated: 2022-12-23 | Indexed: 2025-04-16
Download link:
https://orkg.org/comparison/R280270/
Description:
This comparison surveys existing datasets for NLP tasks on scholarly articles, mainly information extraction (IE), named entity recognition (NER), and summarization. For each dataset it records the annotation level, generation method, data sources, and the concepts and relations that were annotated, along with inter-annotator agreement scores. It also lists the baselines and models that were used, together with their results.
Provider:
Open Research Knowledge Graph
Created:
2022-12-23