
WikiLinkGraphs: A complete, longitudinal and multilanguage dataset of the Wikipedia link networks

Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://zenodo.org/record/2539423
Description:
This dataset contains yearly snapshots of Wikipedia's internal link network for the 9 largest language editions (de, en, es, fr, it, nl, pl, ru, sv). The dataset spans more than 17 years, from the creation of Wikipedia in 2001 to March 2018. The snapshots are taken on March 1st of every year. The graphs include the links extracted from the wikitext of each page (i.e., links written in the form [[wikilink]]). Links transcluded from templates are not included. Redirects are resolved to their target page. More detailed information and supporting datasets are available at: http://disi.unitn.it/~consonni/datasets/.

IMPORTANT NOTICE: Gzipped files are compressed twice by Zenodo. The MD5 provided by Zenodo and the SHA512 sums provided in the `.sha512sums.txt` files match the files compressed once. In other words, when you download a `.gz` file, save it as `.gz.gz` and uncompress it once; the result should match both the MD5 provided by Zenodo and the SHA512 sum provided by us. We have opened a bug report for this behavior on Zenodo's repository at: https://github.com/zenodo/zenodo/issues/1705
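The check described in the notice can be sketched in Python as follows. This is a minimal illustration using only the standard library, not a tool shipped with the dataset: the function names (`gunzip_once`, `file_digest`, `verify_download`) are hypothetical, and the expected checksum is assumed to have been copied by hand from the relevant `.sha512sums.txt` file.

```python
import gzip
import hashlib
import shutil
from pathlib import Path


def gunzip_once(src: Path, dst: Path) -> None:
    """Strip exactly one layer of gzip compression from src into dst."""
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)


def file_digest(path: Path, algorithm: str = "sha512") -> str:
    """Return the hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_download(downloaded: str, expected_sha512: str) -> Path:
    """Verify a file that was saved with a `.gz.gz` name.

    Uncompresses it once (producing the `.gz` file) and compares the
    SHA512 of that once-compressed file with the expected value.
    Returns the path of the `.gz` file, or raises ValueError on mismatch.
    """
    src = Path(downloaded)
    once = src.with_suffix("")  # "name.gz.gz" -> "name.gz"
    gunzip_once(src, once)
    actual = file_digest(once, "sha512")
    if actual != expected_sha512.strip().lower():
        raise ValueError(f"SHA512 mismatch for {once.name}")
    return once
```

Calling `verify_download("some_snapshot.csv.gz.gz", expected)` with a placeholder file name would leave a verified `some_snapshot.csv.gz` next to the download; passing `"md5"` to `file_digest` gives the value to compare against the MD5 shown by Zenodo.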
Created:
2020-01-24