
An Open Dataset of Scholarly Publications Referenced in Selected Policy Documents (POLIDOC_SCHOLAR)

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/8184040
Description:
POLIDOC_SCHOLAR: An Open Dataset of Scholarly Publications Referenced in Selected Policy Documents

This repository contains an open dataset of scholarly publications cited by selected policy documents.

1. Background

We do not aim to create a dataset of references for all policy documents, or for millions of them, but rather for a carefully selected set of policy documents. The long-term plan is to facilitate the inclusion of citations of scholarly publications in open bibliometric databases (or at least to create interoperable datasets). In the short term, we plan to increase the number of policy documents included in the dataset and to continue monitoring and improving data quality (completeness of records, provided external identifiers). In the next release, we will also document the reference extraction process, including the code used.

2. Structure of the dataset

The dataset is structured into two primary categories: "Collections" and "Collection References."

Collections: The collection is a central feature of the POLIDOC_SCHOLAR dataset. The metadata for the selected policy documents is included in the "collections.jsonl" file. For instance, a collection might include reports like the IPCC reports of the 6th Cycle (the "IPCC_AR_6 collection") or the reports from IPBES (the "IPBES collection").

Within each collection, there are "documents." A document is one of two kinds:

- An individual report within a collection (e.g., the IPCC_AR_6 collection contains 6 reports: 3 assessment reports and 3 special reports from the 6th Cycle of the IPCC assessment).
- A specific section of such a report that contains bibliographic references. These sections can be chapters or other segments such as supplementary materials or annexes (any section that has a reference list).

Each document has a unique code, and the relationships between a main document and its subdivisions are indicated in the "is_part_of" field.

Collection References: To allow users to access only the collections they are interested in, we have separated references by collection in files named "collection_reference_{…name of collection…}jsonl." Each of these files includes the bibliographic references for every document in a specific collection. Besides presenting these as "reference strings" (in their original format within the document), we also offer unique identifiers such as DOI and OpenAlex ID to facilitate linkage to external databases.

The documentation of the dataset is provided in the file "data_dictionary".

3. Content release v1

This release (POLIDOC_SCHOLAR version 1) includes 2 collections: IPCC Assessment Cycle 6 and IPBES Assessment reports.

| # | Collection | Number of reports | Number of documents ("sections" with reference) | Number of references (strings, not unique) | Number of references with DOI (unique) | Number of references with DOI (unique) |
|---|---|---|---|---|---|---|
| 1 | IPCC Assessment Cycle 6 | 6 | 103 | 94,958 | 51,713 | 48,695 |
| 2 | IPBES Assessment reports | 3 | 27 | 21,750 | 12,100 | 11,896 |
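As an illustration of the structure described in section 2, the following is a minimal Python sketch of how the JSON Lines files might be read, how sections could be grouped under their parent report through the "is_part_of" field, and how references with a DOI could be counted. Only "is_part_of" is named in the description; the field names `code`, `doi`, and `reference_string`, and all sample records, are assumptions made for this example and should be checked against the "data_dictionary" file before use.

```python
import io
import json
from collections import defaultdict


def read_jsonl(stream):
    """Yield one dict per non-empty line of a JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def group_sections_by_report(documents):
    """Map each parent document code to the codes of its sections.

    Assumes a hypothetical "code" field; "is_part_of" is named in the
    dataset description.
    """
    children = defaultdict(list)
    for doc in documents:
        parent = doc.get("is_part_of")
        if parent:  # assumed: top-level reports have no parent
            children[parent].append(doc["code"])
    return dict(children)


def doi_counts(references):
    """Return (references with a DOI, distinct DOIs).

    Assumes a hypothetical "doi" field. DOIs are compared
    case-insensitively, so they are lower-cased first.
    """
    dois = [ref["doi"].strip().lower() for ref in references if ref.get("doi")]
    return len(dois), len(set(dois))


# Made-up records standing in for collections.jsonl and one references file.
sample_documents = io.StringIO(
    '{"code": "REPORT_A", "is_part_of": null}\n'
    '{"code": "REPORT_A_CH1", "is_part_of": "REPORT_A"}\n'
    '{"code": "REPORT_A_CH2", "is_part_of": "REPORT_A"}\n'
)
sample_references = [
    {"reference_string": "Author A (2020) ...", "doi": "10.1000/example.1"},
    {"reference_string": "Author A (2020) ...", "doi": "10.1000/EXAMPLE.1"},
    {"reference_string": "Author B (2019) ...", "doi": None},
]

sections = group_sections_by_report(read_jsonl(sample_documents))
with_doi, unique_doi = doi_counts(sample_references)
print(sections)
print(with_doi, unique_doi)
```

With real data, the `io.StringIO` sample would be replaced by an opened file, for example `open("collections.jsonl", encoding="utf-8")`, and the references list by the records read from one of the per-collection reference files.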
Created:
2024-07-11