
Cross-language Wikipedia link graph

Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/7163079
Description:
Wikipedia articles use Wikidata to list the links to the same article in other language versions. Therefore, each Wikipedia language edition stores the Wikidata Q-id for each article. This dataset constitutes a Wikipedia link graph in which all article identifiers are normalized to Wikidata Q-ids. It contains the normalized links from all Wikipedia language versions. Detailed link count statistics are attached. Note that articles with neither incoming nor outgoing links are not part of this graph.

Each line contains three fields:

    Q-id of linking page (outgoing)
    Q-id of linked page (incoming)
    language version - dump date (20241101)

This dataset was used to compute Wikidata PageRank. More information can be found on the danker repository, where the source code of the link extraction as well as the PageRank computation is hosted.

Example entries:

    $ bzcat 2024-11-06.allwiki.links.bz2 | head
    1    107    ckbwiki-20241101
    1    107    lawiki-20241101
    1    107    ltwiki-20241101
    1    107    tewiki-20241101
    1    107    wuuwiki-20241101
    1    111    hywwiki-20241101
    1    11379    bat_smgwiki-20241101
    1    11471    cdowiki-20241101
    1    150    ckbwiki-20241101
    1    150    lowiki-20241101
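The three-field format described above can be read with a few lines of Python. This is a minimal sketch and not code from the danker repository. It assumes fields are separated by whitespace (tabs or spaces), that the ids appear without the "Q" prefix as in the example entries, and that the third field splits at its last hyphen. The names Link, parse_links and read_links are hypothetical and introduced only for illustration.

```python
import bz2
from collections import defaultdict
from typing import Iterable, Iterator, NamedTuple


class Link(NamedTuple):
    source: str  # id of the linking page (outgoing), as written in the file
    target: str  # id of the linked page (incoming), as written in the file
    wiki: str    # language version, e.g. "ckbwiki"
    dump: str    # dump date, e.g. "20241101"


def parse_links(lines: Iterable[str]) -> Iterator[Link]:
    """Yield one Link per well-formed line; skip lines without three fields."""
    for line in lines:
        fields = line.split()  # assumption: whitespace-separated fields
        if len(fields) != 3:
            continue
        source, target, version = fields
        # Split "bat_smgwiki-20241101" at the last hyphen.
        wiki, _, dump = version.rpartition("-")
        yield Link(source, target, wiki, dump)


def read_links(path: str) -> Iterator[Link]:
    """Stream links from a bz2-compressed file without unpacking it to disk."""
    with bz2.open(path, "rt", encoding="utf-8") as handle:
        yield from parse_links(handle)


# A few of the example entries listed above, used as in-memory test input.
sample = [
    "1    107    ckbwiki-20241101",
    "1    107    lawiki-20241101",
    "1    111    hywwiki-20241101",
    "1    11379    bat_smgwiki-20241101",
]

links = list(parse_links(sample))

# The same link can appear once per language version, so collect
# distinct targets per source to get a language-independent view.
targets_by_source = defaultdict(set)
for link in links:
    targets_by_source[link.source].add(link.target)

print(links[0])
print(sorted(targets_by_source["1"], key=int))
```

For the full dump, read_links("2024-11-06.allwiki.links.bz2") would stream the same records from the compressed file. Streaming is the safer choice here because the file covers all language versions and may not fit in memory.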
Created:
2024-11-13