ISWC 2023 LM-KBC Challenge Dataset

Name: ISWC 2023 LM-KBC Challenge Dataset
Creator: Wikidata
License: No description available

Source: arXiv, indexed 2025-09-30

Download link:

https://github.com/bohuizhang/LLMKE

Resource description:

This dataset comprises 21 Wikidata relation types covering 7 domains, including music, television series, sports, geography, chemistry, business, administrative divisions, and public figure information. The training, validation and test sets together contain 1,940 statements. The dataset can be used to investigate the knowledge gap between large language models and Wikidata; the ground truth for offline evaluation was generated via SPARQL queries against Wikidata. The task targets knowledge engineering and knowledge base enrichment.
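As a rough illustration of how ground-truth objects for one subject and relation could be retrieved from Wikidata with SPARQL, the sketch below builds a query and a request URL for the public Wikidata Query Service. It is not the challenge organizers' actual script. The helper names build_object_query and build_request_url are hypothetical, and the example entity Q183 (Germany) and property P47 (shares border with) are chosen only for illustration. No network request is made.

```python
from urllib.parse import urlencode

# Public Wikidata Query Service endpoint.
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_object_query(subject_qid: str, property_pid: str) -> str:
    """Return a SPARQL query listing every object of (subject, property).

    Hypothetical helper: the challenge's real queries may add filters
    or post-processing that this sketch does not attempt.
    """
    return (
        "SELECT ?object ?objectLabel WHERE {\n"
        f"  wd:{subject_qid} wdt:{property_pid} ?object .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}"
    )


def build_request_url(query: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    params = urlencode({"query": query, "format": "json"})
    return f"{WIKIDATA_SPARQL_ENDPOINT}?{params}"


# Illustrative pair only: Q183 is Germany, P47 is "shares border with".
query = build_object_query("Q183", "P47")
url = build_request_url(query)
print(query)
```

Fetching the URL with any HTTP client would return the matching objects as JSON; storing those results once is what makes later evaluation possible offline.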

Provider:

Wikidata
