Data for: NILK, entity linking dataset targeting NIL-linking cases

Source: Mendeley Data. Updated: 2024-03-27. Indexed: 2024-06-30.
Download link:
https://darus.uni-stuttgart.de/citation?persistentId=doi:10.18419/darus-3454
Description:
A dataset for the NIL-detection and NIL-disambiguation tasks. The NILK dataset has two main features: 1) It marks NIL-mentions for NIL-detection by extracting, from Wikipedia text, mentions that belong to newly added entities. 2) It provides an entity label for NIL-disambiguation by marking NIL-mentions with Wikidata IDs from the newer dump.

Dataset files contain JSON objects of the following structure:

{"mention":"Walter Damrosch", "offset":348, "length":15, "context":"...the conductor Walter Damrosch. He scored the piece for the standard instruments of the symphony orchestra plus celesta, saxophone, and automobile horns...", "wikipedia_page_id":"309", "wikidata_id":"Q725579", "nil":false}

The dataset contains both linked and unlinked mentions; the "nil" flag distinguishes them. To obtain NIL-mentions, we compared two Wikidata dumps, one from 2017 and one from 2021. Each NIL-mention carries its Wikidata ID from the 2021 dump, which can be used to check whether two NIL-mentions refer to the same entity.
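As a rough illustration of how the "nil" flag and "wikidata_id" field described above might be used, the following Python sketch separates linked mentions from NIL-mentions and groups the NIL-mentions by their Wikidata ID. It assumes one JSON object per line, which the description does not state explicitly. The function name `split_mentions` and the two NIL sample records (including the ID `Q_PLACEHOLDER`) are invented for illustration and are not taken from the dataset.

```python
import json
from collections import defaultdict


def split_mentions(lines):
    """Split records into linked mentions and NIL-mentions grouped by Wikidata ID.

    Assumption: `lines` yields one JSON object per line (hypothetical layout).
    """
    linked = []
    nil_groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if record["nil"]:
            # NIL-mentions sharing a Wikidata ID refer to the same entity
            nil_groups[record["wikidata_id"]].append(record)
        else:
            linked.append(record)
    return linked, dict(nil_groups)


# First record follows the example in the description (context shortened);
# the two NIL records are made-up placeholders, not real dataset entries.
sample = [
    '{"mention":"Walter Damrosch","offset":348,"length":15,"context":"...",'
    '"wikipedia_page_id":"309","wikidata_id":"Q725579","nil":false}',
    '{"mention":"A","offset":0,"length":1,"context":"A",'
    '"wikipedia_page_id":"0","wikidata_id":"Q_PLACEHOLDER","nil":true}',
    '{"mention":"B","offset":0,"length":1,"context":"B",'
    '"wikipedia_page_id":"0","wikidata_id":"Q_PLACEHOLDER","nil":true}',
]

linked, nil_groups = split_mentions(sample)
print(len(linked), {qid: len(group) for qid, group in nil_groups.items()})
```

With a real file, the same function could be called on an open file handle, for example `split_mentions(open(path, encoding="utf-8"))`, where `path` points at one of the downloaded dataset files.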
Created:
2023-07-14