Discourse Aware Scholarly Knowledge Graph Construction Dataset

Name: Discourse Aware Scholarly Knowledge Graph Construction Dataset
Creator: Zenodo
Published: 2026-05-06 16:06:01
License: No description available

Source: DataCite Commons. Updated 2026-05-06; indexed 2026-05-07.

Download link:

https://zenodo.org/doi/10.5281/zenodo.20055705

Resource description:

Description

The dataset is a manually curated benchmark for evaluating discourse-aware scholarly knowledge graph construction from scientific papers. It was created to support the development and evaluation of the Scholarly Upper Discourse Ontology (SUDO) and the SUDO-KG construction pipeline.

The dataset is derived from research papers associated with the AMSR peer-review corpus. For each paper, the dataset includes source paper artifacts, parsed text representations, reviewer-facing metadata, and manually annotated gold-standard knowledge graph annotations. The annotations cover named entity spans and classes, finite-clause-level proposition spans and classes, and context-aware relations between artifacts, propositions, and proposition pairs.

This dataset is suitable for research on:

- scholarly knowledge graph construction
- discourse-aware knowledge representation
- scientific information extraction
- ontology-guided annotation and evaluation
- grounded knowledge graph generation from scientific text
- evaluation of LLM-based and neuro-symbolic KGC pipelines

Folder Structure

The dataset is organized as one directory per paper. Each paper directory is named using a unique paper identifier.

```text
gold_standard_dataset/v1/<paper_id>/
    annotation.json
    fact.json
    facts.json
    paper.grobid.tei.xml
    paper.mainbody.md
    paper.md
    paper.pdf
    review.json
    sample.txt
```

Main files:

- annotation.json: gold SUDO annotation used for KGC evaluation.
- review.json: paper review metadata used for abstract-test preparation.
- paper.pdf: original paper PDF.
- paper.grobid.tei.xml: GROBID-parsed TEI XML.
- paper.mainbody.md: Markdown text for the main body of the paper.
- fact.json: supporting fact-level annotations.
- sample.txt: auxiliary text sample for inspection or testing.
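The one-directory-per-paper layout described above lends itself to a quick completeness check after download. The following is a minimal sketch, not part of the dataset's own tooling: the helper names `missing_files` and `check_dataset` are hypothetical, the file names are copied from the folder listing in the description, and the internal schema of the JSON files is deliberately not assumed.

```python
from pathlib import Path

# File names copied from the folder listing in the dataset description.
EXPECTED_FILES = (
    "annotation.json",
    "fact.json",
    "facts.json",
    "paper.grobid.tei.xml",
    "paper.mainbody.md",
    "paper.md",
    "paper.pdf",
    "review.json",
    "sample.txt",
)


def missing_files(paper_dir):
    """Return the expected file names that are absent from one paper directory."""
    paper_dir = Path(paper_dir)
    return [name for name in EXPECTED_FILES if not (paper_dir / name).is_file()]


def check_dataset(root):
    """Map each paper identifier (the directory name) to its missing files.

    Only sub-directories of `root` are treated as papers; loose files are ignored.
    """
    root = Path(root)
    return {
        entry.name: missing_files(entry)
        for entry in sorted(root.iterdir())
        if entry.is_dir()
    }
```

Assuming the archive was extracted so that the paper directories sit under `gold_standard_dataset/v1`, calling `check_dataset("gold_standard_dataset/v1")` would return a dictionary in which an empty list marks a complete paper directory.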

Provider:

Zenodo

Created:

2026-05-06
