Key generic technology prediction in patent citation using graph neural networks

NIAID Data Ecosystem, indexed 2026-05-01
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.nk98sf803
Resource description:
With the rapid advancement of the Fourth Industrial Revolution, international competition in technology and industry is intensifying. However, in the era of big data and large-scale science, accurately judging key technology areas and innovation trends has become exceptionally difficult. This paper constructs a patent indicator evaluation system based on the key and generic dimensions of patent citation, integrates graph neural network modeling to predict key generic technologies, and confirms the effectiveness of the method using the field of genetic engineering as an example. According to the LDA topic model, the main technical R&D directions in genetic engineering are genetic analysis and detection technologies; the application of microorganisms in industrial production; virology research involving vaccine development and immune responses; high-throughput sequencing and analysis technologies in genomics; targeted drug design and molecular therapeutic strategies; and genetically modified crop improvement. The accuracy of the graph neural network models in predicting key generic technologies reaches 97.67%. Based on patent citation theory and graph neural network models, this paper considers the structural and technical attributes of cited patents, provides theoretical and empirical evidence for technology prediction, and offers both theoretical and practical value.

Methods

These datasets were obtained from the Incopat patent database and cover cited patents (2013-2022) in the field of genetic engineering. Details of the datasets are provided in the README file. This directory contains the following selection of the patent datasets:
1) Table of key generic indicators for nodes (partial 1).csv
   This file consists of 10 indicators of patents: technical coverage, patent families, patent family citation, patent cooperation, enterprise-enterprise cooperation, industry-university-research cooperation, claims, citation frequency, layout countries, and layout countries.

2) Table of key generic indicators for nodes (partial 2).csv
   This file consists of 10 indicators of patents: technical convergence, cited countries, inventors, citations, homologous countries/areas, degree centrality, closeness centrality, betweenness centrality, eigenvector centrality, and PageRank.

3) patent.content
   The content file contains descriptions of the patents in the following format: <ID_number> <technical_attributes> + <class_label>. The first entry in each line is the unique string ID number of the patent. It is followed by binary values indicating whether the patent's value exceeds the average of the corresponding indicator (1) or does not (0). The last entry in the line is the class label of the patent.

4) patent.cites
   Each line contains two patent ID numbers. The first entry is the ID number of the patent being cited, and the second is the ID number of the patent that contains the citation. The direction of the link is from right to left: a line "patent1 patent2" represents the link "patent2->patent1".

5) Graph neural network modeling highest accuracy for different dimensions.csv
   This file shows the best accuracies of the GCN, SAGE, and GAT models for different dimensions.

6) Prediction effects of key generic technologies.csv
   This file shows the accuracies of the GCN, SAGE, and GAT models at different epochs.
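
To make the patent.content and patent.cites layout concrete, here is a minimal Python sketch of how the two files might be read. It is an illustration under stated assumptions, not the authors' code: fields are assumed to be whitespace-separated, class labels are assumed to contain no spaces, and the sample IDs (P1, P2, P3), the labels ("key", "other"), and the helper names parse_content, parse_cites, and binarize_by_mean are all hypothetical. The binarization helper follows the rule described for the node attributes (1 if a value exceeds the indicator's average, else 0).

```python
from typing import Dict, List, Tuple


def parse_content(text: str) -> Tuple[Dict[str, List[int]], Dict[str, str]]:
    """Parse node lines of the form: <ID_number> <binary attributes...> <class_label>."""
    features: Dict[str, List[int]] = {}
    labels: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split()  # assumption: whitespace-separated fields
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        node_id, *attrs, label = parts
        features[node_id] = [int(v) for v in attrs]
        labels[node_id] = label
    return features, labels


def parse_cites(text: str) -> List[Tuple[str, str]]:
    """Parse 'cited citing' lines into directed (citing, cited) edges."""
    edges: List[Tuple[str, str]] = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        cited, citing = parts
        # "patent1 patent2" means patent2 -> patent1
        edges.append((citing, cited))
    return edges


def binarize_by_mean(rows: List[List[float]]) -> List[List[int]]:
    """Return 1 where a value exceeds its indicator's column mean, else 0."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[1 if v > m else 0 for v, m in zip(row, means)] for row in rows]


# Hypothetical sample data, not taken from the dataset.
sample_content = "P1 1 0 1 key\nP2 0 0 1 other\nP3 1 1 0 key\n"
sample_cites = "P1 P2\nP1 P3\nP3 P2\n"

features, labels = parse_content(sample_content)
edges = parse_cites(sample_cites)

# Count how often each patent is cited (in-degree of the citation graph).
in_degree = {node_id: 0 for node_id in features}
for citing, cited in edges:
    in_degree[cited] += 1
```

The resulting feature lists, labels, and edge list are the usual inputs for node-classification models such as GCN, SAGE, or GAT; how the authors actually fed them into those models is described in the README rather than here.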
Created:
2024-01-11