
DKS Datasets and Source Codes

Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://data.mendeley.com/datasets/wfbnhy5gvw
Description:
1. Five "shell" scripts for running DKS on the five standard real-world datasets: shell-citeseer.py, shell-cornell.py, shell-wisconsin.py, shell-toy.py, and shell-video.py.
2. Four source code files: the DKS code, the contaminated graph generation code, the incomplete graph generation code, and the query generation code.
3. Five standard real-world datasets from various domains:
   (1) "CiteSeer" is a standard citation network dataset. Nodes represent documents, edges represent citation links, and keywords are the bag-of-words representation of each paper.
   (2) "Cornell" and "Wisconsin" are two subsets of a webpage dataset collected from the computer science departments of various universities. Nodes denote web pages, edges denote hyperlinks between pages, and keywords are the bag-of-words representation of each web page.
   (3) "Toy" and "Video" are co-purchase networks. Nodes denote products, and keywords are features of each product. An edge connects two products if both were purchased by the same customer.
4. Three extended datasets:
   (1) "Pubmed" is another standard citation network dataset. Nodes represent documents, edges represent citation links, and keywords are the bag-of-words representation of each paper.
   (2) "Chameleon" and "Squirrel" are very dense, heterogeneous, knowledge-graph-style datasets. In both, nodes represent Wikipedia entries, edges represent links between entries, and keywords are descriptive terms for the entries.
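The co-purchase rule described for "Toy" and "Video" (an edge between two products bought by the same customer, with keywords attached to each product) can be illustrated with a minimal Python sketch. This is not the authors' code and does not reflect the actual file format of the datasets; the function name build_copurchase_graph, the (customer, product) record layout, and the sample data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_copurchase_graph(purchases, product_keywords):
    """Build an undirected co-purchase graph (hypothetical helper).

    purchases: iterable of (customer, product) pairs (assumed layout).
    product_keywords: dict mapping each product to its set of keywords.
    Returns a dict mapping each product to the set of its neighbours.
    """
    # Group the products bought by each customer.
    baskets = defaultdict(set)
    for customer, product in purchases:
        baskets[customer].add(product)

    # Every product is a node, even if it ends up with no edges.
    adjacency = {product: set() for product in product_keywords}

    # Two products are linked if they share at least one customer.
    for products in baskets.values():
        for a, b in combinations(sorted(products), 2):
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)
    return adjacency


# Made-up sample data, for illustration only.
purchases = [
    ("alice", "robot"),
    ("alice", "puzzle"),
    ("bob", "puzzle"),
    ("bob", "kite"),
]
keywords = {
    "robot": {"toy", "battery"},
    "puzzle": {"toy", "wood"},
    "kite": {"outdoor"},
}
graph = build_copurchase_graph(purchases, keywords)
```

In this sample, "puzzle" is linked to both "robot" and "kite" because it shares a customer with each, while "robot" and "kite" are not linked because no single customer bought both.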
Created: 2025-11-17