DKS Datasets and Source Codes
Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://data.mendeley.com/datasets/wfbnhy5gvw
Description:
1. Five scripts ("shells") for running DKS on the five standard real-world datasets:
shell-citeseer.py, shell-cornell.py, shell-wisconsin.py, shell-toy.py, and shell-video.py.
2. Source code for four components: DKS, contaminated graph generation, incomplete graph generation, and query generation.
3. Five standard real-world datasets from various domains:
(1) "CiteSeer" is a standard citation network dataset, where nodes represent documents, edges represent citation links, and keywords are the bag-of-words representation of papers.
(2) "Cornell" and "Wisconsin" are two subsets of a webpage dataset collected from the computer science departments of various universities, where nodes denote web pages, edges denote hyperlinks between pages, and keywords are the bag-of-words representation of web pages.
(3) "Toy" and "Video" are co-purchase networks, where nodes denote products and keywords are features of the products. An edge connects two products if both were purchased by the same customer.
4. Three extended datasets:
(1) "Pubmed" is another standard citation network dataset, where nodes represent documents, edges represent citation links, and keywords are the bag-of-words representation of papers.
(2) "Chameleon" and "Squirrel" are very dense, heterogeneous, knowledge-graph-style datasets. In both, nodes represent Wikipedia entries, edges represent links between entries, and keywords are descriptive terms for the entries.
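To make the co-purchase construction described for "Toy" and "Video" concrete, here is a minimal Python sketch of how such a keyword-attributed graph could be assembled. The input format (a list of customer/product pairs), the function name build_copurchase_edges, and the sample keywords are all hypothetical; this listing does not describe the file format of the released datasets, so this is an illustration of the rule "an edge connects two products purchased by the same customer", not the authors' generation code.

```python
from collections import defaultdict
from itertools import combinations


def build_copurchase_edges(purchases):
    """Return undirected co-purchase edges from (customer, product) pairs.

    `purchases` is a hypothetical input format chosen for illustration.
    Each edge is stored as a sorted tuple so (u, v) and (v, u) coincide.
    """
    # Group the products bought by each customer.
    baskets = defaultdict(set)
    for customer, product in purchases:
        baskets[customer].add(product)

    # Connect every pair of products that share at least one customer.
    edges = set()
    for products in baskets.values():
        for u, v in combinations(sorted(products), 2):
            edges.add((u, v))
    return edges


# Toy example data (invented for this sketch).
purchases = [
    ("c1", "p1"), ("c1", "p2"),
    ("c2", "p2"), ("c2", "p3"),
    ("c3", "p1"),
]
# Node keywords: each product carries a set of feature terms.
keywords = {
    "p1": {"puzzle", "wooden"},
    "p2": {"puzzle", "kids"},
    "p3": {"board", "game"},
}

edges = build_copurchase_edges(purchases)
```

In this sketch, p1 and p2 are linked through customer c1, and p2 and p3 through customer c2; customer c3 bought a single product and contributes no edge.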
Created:
2025-11-17



