
A stochastic generative model for citation networks among academic papers

NIAID Data Ecosystem, indexed 2026-03-13
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.z8w9ghxfh
Description:
We propose a stochastic generative model for the directed graph formed by citations among academic papers, in which nodes represent papers with discrete publication times and directed edges represent citations. As in existing models, a citation between two papers occurs with a probability that depends on the type of the citing paper, the importance of the cited paper, and the difference between their publication times. We take the out-degree of a citing paper as its type because, for example, a survey paper cites many papers. We approximate the importance of a cited paper by its in-degree. The model uses three functions: a logistic function for the number of papers published at each discrete time, an inverse Gaussian probability density function for the aging effect as a function of the difference between publication times, and an exponential distribution (or a generalized Pareto distribution) for the out-degree distribution. We argue that our model is more reasonable and appropriate than existing stochastic models and that it can run complete simulations without using the original data. In this paper, we first use the Web of Science database to examine the features used in our model. Using the proposed model, we then generate simulated graphs and show that they are similar to the original data in their in-degree and out-degree distributions and in node triangle participation. In addition, we analyze two other citation networks derived from physics papers in the arXiv database and verify the effectiveness of the model.
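The description above names the three ingredient functions but not how they are combined, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that the logistic curve gives the number of new papers per time step, that the citation weight is the product of (in-degree + 1) and the inverse Gaussian aging term, and that in-degrees are updated once per time step. All function names and parameter values (`capacity`, `growth`, `midpoint`, `mu`, `lam`, `mean_out_degree`) are hypothetical.

```python
import math
import random


def logistic_paper_count(t, capacity, growth, midpoint):
    """Logistic curve, read here as the number of papers published at time t (assumption)."""
    return capacity / (1.0 + math.exp(-growth * (t - midpoint)))


def inverse_gaussian_pdf(x, mu, lam):
    """Inverse Gaussian density, used as the aging weight for a publication-time gap x."""
    if x <= 0:
        return 0.0
    return math.sqrt(lam / (2.0 * math.pi * x ** 3)) * math.exp(
        -lam * (x - mu) ** 2 / (2.0 * mu ** 2 * x)
    )


def sample_out_degree(rng, mean_out_degree):
    """Draw an out-degree from an exponential distribution, truncated to an integer."""
    return int(rng.expovariate(1.0 / mean_out_degree))


def simulate(num_steps, rng, capacity=50.0, growth=0.5, midpoint=5.0,
             mu=3.0, lam=6.0, mean_out_degree=4.0):
    """Generate a toy citation graph; returns (pub_time, edges) with edges as (citing, cited)."""
    pub_time = []   # pub_time[i] is the publication time of paper i
    in_degree = []  # in_degree[i] is the number of citations paper i has received
    edges = []
    for t in range(1, num_steps + 1):
        n_new = max(1, round(logistic_paper_count(t, capacity, growth, midpoint)))
        candidates = list(range(len(pub_time)))  # only strictly earlier papers can be cited
        first_new = len(pub_time)
        new_edges = []
        if candidates:
            # Assumed combination: importance (in-degree + 1) times the aging term.
            weights = [
                (in_degree[j] + 1) * inverse_gaussian_pdf(t - pub_time[j], mu, lam)
                for j in candidates
            ]
            for i in range(first_new, first_new + n_new):
                k = min(sample_out_degree(rng, mean_out_degree), len(candidates))
                if k > 0:
                    # Sampling with replacement, then de-duplicating, so a paper
                    # may end up with fewer than k distinct references.
                    cited = set(rng.choices(candidates, weights=weights, k=k))
                    new_edges.extend((i, j) for j in cited)
        # In-degrees are updated only after the whole time step (simplification).
        for _, j in new_edges:
            in_degree[j] += 1
        pub_time.extend([t] * n_new)
        in_degree.extend([0] * n_new)
        edges.extend(new_edges)
    return pub_time, edges
```

Calling `simulate(10, random.Random(0))` produces a small graph without touching any original data. Swapping `sample_out_degree` for a generalized Pareto sampler would correspond to the alternative out-degree distribution mentioned in the description.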
Methods: We focus on WoS-Stat, a subset of the Web of Science (WoS). It is a citation network comprising the citations between papers published in journals whose subject is associated with “Statistics and Probability.” We construct the network from each paper's identifier (ID), publication year, and reference list (a list of paper IDs), covering the 36 years from 1981 to 2016. WoS-Stat consists of 179,483 papers and 1,106,622 citations.
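The construction described in the Methods can be sketched roughly as follows. This is an illustration under assumptions: the record layout `(paper_id, year, reference_ids)`, the function name `build_citation_network`, and the toy records are hypothetical and do not reflect the actual file format of the dataset. References to papers outside the record set are dropped, which matches the statement that the network contains citations between papers in the subset.

```python
from collections import Counter


def build_citation_network(records):
    """Build edges and degree counts from a list of (paper_id, year, reference_ids) tuples."""
    year_of = {paper_id: year for paper_id, year, _ in records}
    edges = set()
    for paper_id, _, reference_ids in records:
        for ref in reference_ids:
            # Keep only citations whose target is also in the dataset,
            # and ignore self-references.
            if ref in year_of and ref != paper_id:
                edges.add((paper_id, ref))
    out_degree = Counter(citing for citing, _ in edges)  # proxy for the citing paper's type
    in_degree = Counter(cited for _, cited in edges)     # proxy for the cited paper's importance
    return year_of, edges, in_degree, out_degree


# Hypothetical toy records; "X" is not in the dataset, so that reference is dropped.
toy_records = [
    ("A", 1981, []),
    ("B", 1983, ["A"]),
    ("C", 1985, ["A", "B", "X"]),
]
year_of, edges, in_degree, out_degree = build_citation_network(toy_records)
```

For the figures quoted above, 1,106,622 citations over 179,483 papers gives a mean out-degree of roughly 6.17 citations per paper.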
Created: 2022-06-05