Gollum: A Gold Standard for Large Scale Multi Source Knowledge Graph Matching

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/6523399
Description:
The set of knowledge graphs (KGs) generated with automatic and manual approaches is constantly growing. For an integrated view and usage, these KGs must be aligned on both the schema and the instance level. Some approaches already try to tackle this multi-source knowledge graph matching problem, but large gold standards for evaluating their effectiveness and scalability are missing. In particular, most existing gold standards are fairly small and can be solved by matchers that match exactly two KGs (1:1), which make up the majority of existing matching systems. We close this gap by presenting Gollum, a gold standard for large-scale multi-source knowledge graph matching with over 275,000 correspondences between 4,149 different KGs. The correspondences originate from knowledge graphs derived by applying the DBpedia extraction framework to a large wiki farm.

Three variations of the gold standard are made available: (1) a version with all correspondences for evaluating unsupervised matching approaches, and two versions for evaluating supervised matching: (2) one where each KG is contained in both the train and the test set, and (3) one where each KG is contained exclusively in either the train or the test set. We plan to extend our KG track at the Ontology Alignment Evaluation Initiative (OAEI) to allow for matching systems that are specifically designed to solve the multi-KG matching problem. As a first step in this direction, we evaluate multi-source matching approaches that reuse two-KG (1:1) matchers from past OAEI campaigns.

Due to the size of the KG files, they are hosted at the institute:
http://data.dws.informatik.uni-mannheim.de/dbkwik/gollum/40K.tar (50.3 GB)
http://data.dws.informatik.uni-mannheim.de/dbkwik/gollum/all.tar (74.7 GB)
http://data.dws.informatik.uni-mannheim.de/dbkwik/gollum/gold.tar (25.3 GB)
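
The difference between the two supervised variations, (2) and (3), can be illustrated with a small sketch. It is not the authors' splitting procedure and does not read Gollum's file format. The tuple layout `(kg_a, entity_a, kg_b, entity_b)`, the function names, the `test_fraction` parameter, and the choice to drop correspondences that cross the train/test boundary are all assumptions made for illustration.

```python
import random
from collections import defaultdict

# A correspondence is modeled as (kg_a, entity_a, kg_b, entity_b).
# This layout is hypothetical and is not Gollum's on-disk format.


def split_by_correspondence(correspondences, test_fraction=0.5, seed=0):
    """Variation (2) style: split within each KG pair, so a KG may occur in both sets."""
    rng = random.Random(seed)
    by_pair = defaultdict(list)
    for c in correspondences:
        by_pair[(c[0], c[2])].append(c)
    train, test = [], []
    for pair in sorted(by_pair):
        items = by_pair[pair]
        rng.shuffle(items)
        cut = int(len(items) * (1 - test_fraction))
        train.extend(items[:cut])
        test.extend(items[cut:])
    return train, test


def split_by_kg(correspondences, test_fraction=0.5, seed=0):
    """Variation (3) style: each KG is assigned to train or test, never both."""
    rng = random.Random(seed)
    kgs = sorted({c[0] for c in correspondences} | {c[2] for c in correspondences})
    rng.shuffle(kgs)
    cut = int(len(kgs) * (1 - test_fraction))
    train_kgs = set(kgs[:cut])
    train, test = [], []
    for c in correspondences:
        a_in_train = c[0] in train_kgs
        b_in_train = c[2] in train_kgs
        if a_in_train and b_in_train:
            train.append(c)
        elif not a_in_train and not b_in_train:
            test.append(c)
        # Assumption: correspondences that cross the boundary are dropped.
    return train, test
```

In the second function a supervised matcher never sees any test KG during training, which is the stricter setting; the first function only guarantees that individual correspondences are not shared between the two sets.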
Created:
2023-04-11