A dataset of knowledge graph construction for patents, sci-tech achievements and papers in agriculture, industry and service industry based on sample data

Source: DataCite Commons | Updated: 2026-04-09 | Indexed: 2026-05-05
Download link:
https://www.scidb.cn/detail?dataSetId=eeca5486f99d4f92836f38ce7645b5ef
Description:
As important carriers of innovation activity, patents, sci-tech achievements and papers play an increasingly prominent role in national political and economic development amid a new round of technological revolution and industrial transformation. However, in a distributed and heterogeneous environment, the integration and systematic description of data on patents, sci-tech achievements and papers remain insufficient, which limits in-depth analysis and use of these data resources. A knowledge graph dataset for patents, sci-tech achievements and papers is an important means of promoting innovation network research, and is of great significance for strengthening the development, utilization, and knowledge mining of innovation data. This work collected sample data on patents, sci-tech achievements and papers from authoritative Chinese websites, spanning the three major industries (agriculture, industry, and services) over the period 2022-2025. After cleaning, organizing, and normalization, a patents-sci-tech achievements-papers knowledge graph dataset was formed, containing 10 entity types and 8 types of entity relationships. To ensure data quality and accuracy, the entire process involved strict preprocessing, semantic extraction and verification, with an ontology model introduced as the schema layer of the knowledge graph. The dataset establishes direct correlations among patents, sci-tech achievements and papers through inventors/contributors/authors, and uses the Neo4j graph database for storage and visualization. The open dataset constructed in this study can serve as sample data for building knowledge graphs in the field of innovation, providing structured data support for innovation activity analysis, scientific research collaboration network analysis and knowledge discovery.
The dataset consists of two parts. The first part includes three Excel tables: 1,794 patent records with 10 fields, 701 paper records with 7 fields, and 1,156 scientific and technological achievement records with 11 fields. The second part is a knowledge graph dataset in CSV format that can be imported into Neo4j, comprising 10 entity files and 8 relationship files.
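
The description says that patents, achievements and papers are linked through shared inventors/contributors/authors, and that the graph ships as CSV relationship files. The following is a minimal Python sketch of how such a link could be derived from two relationship files without loading Neo4j. The column names (`person_id`, `patent_id`, `paper_id`) and the inline rows are hypothetical placeholders, not the dataset's actual schema or content; check the real file headers before adapting it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-ins for two of the relationship CSV files.
# Column names and rows are invented for illustration only.
INVENTED_CSV = """person_id,patent_id
p1,pat1
p2,pat2
"""

AUTHORED_CSV = """person_id,paper_id
p1,pap1
p1,pap2
"""

def read_rows(text):
    """Parse CSV text into a list of dicts keyed by header name."""
    return list(csv.DictReader(io.StringIO(text)))

def group_by_person(rows, target_column):
    """Map each person_id to the set of target ids it is linked to."""
    grouped = defaultdict(set)
    for row in rows:
        grouped[row["person_id"]].add(row[target_column])
    return grouped

def link_patents_to_papers(invented_rows, authored_rows):
    """Return sorted (patent_id, paper_id, person_id) triples sharing a person."""
    patents = group_by_person(invented_rows, "patent_id")
    papers = group_by_person(authored_rows, "paper_id")
    links = []
    # Only people who appear in both files can connect a patent to a paper.
    for person_id in patents.keys() & papers.keys():
        for patent_id in patents[person_id]:
            for paper_id in papers[person_id]:
                links.append((patent_id, paper_id, person_id))
    return sorted(links)

links = link_patents_to_papers(read_rows(INVENTED_CSV), read_rows(AUTHORED_CSV))
for patent_id, paper_id, person_id in links:
    print(patent_id, paper_id, person_id)
```

To run this against the downloaded files, replace the inline strings with the contents of the corresponding relationship CSVs and adjust the column names to match their headers. The same join extends to sci-tech achievements by adding a third relationship file.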
Provider:
Science Data Bank
Created:
2025-10-22