tBiodivL: Larger Semantic Table Annotations Benchmark for Biodiversity Domain

Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/10283082
Description:
tBiodivL is a dataset for matching tabular data to knowledge graphs. It is derived from the biodiversity domain and contains two types of tables: Horizontal Relational Tables, in which each table represents a collection of entities, and Entity Tables, each of which represents a single entity. Ground truth data is provided with Wikidata as the target knowledge graph (KG). tBiodivL is generated by KG2Tables using 10 levels of a recursive hierarchy of related concepts in Wikidata. It is the successor to tBiodiv.

tBiodivL contains 222,353 entity and horizontal tables. This repository contains only a 1% sample of the generated tables, together with the corresponding ground truth (gt) data. The full size of the dataset is 312 GB. We will update this repository with the full dataset in the future. Please get in touch if you are interested in the full dataset.

The supported semantic table annotation tasks are:

Topic Detection (TD) links the entire table to an entity or a class from the target KG.
Cell Entity Annotation (CEA) maps individual table cells to entities from the target KG.
Column Type Annotation (CTA) links individual table columns to classes from the target KG.
Column Property Annotation (CPA) detects the relations between column pairs, expressed as properties from the target KG.
Row Annotation (RA) links an entire row to a KG entity or property.
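To make the five tasks concrete, here is a minimal Python sketch of what annotations for one tiny table could look like. The container class `TableAnnotations`, the toy table, and the helper `cell_match_rate` are hypothetical and exist only for illustration. They do not reflect the actual file layout of the tBiodivL ground truth or its official evaluation procedure. The Wikidata identifiers used are Q140 (lion), Q16521 (taxon), and P225 (taxon name).

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class TableAnnotations:
    """Hypothetical in-memory container for one table's annotations."""
    td: str = ""                                                  # TD: whole table -> entity or class
    cea: Dict[Tuple[int, int], str] = field(default_factory=dict)  # CEA: (row, col) -> entity
    cta: Dict[int, str] = field(default_factory=dict)              # CTA: col -> class
    cpa: Dict[Tuple[int, int], str] = field(default_factory=dict)  # CPA: (col_a, col_b) -> property
    ra: Dict[int, str] = field(default_factory=dict)               # RA: row -> entity or property


# Toy horizontal table with one data row (header omitted): common name, scientific name.
table = [["lion", "Panthera leo"]]

# Illustrative ground truth for the toy table above.
gt = TableAnnotations(
    td="Q16521",             # the table is about taxa
    cea={(0, 0): "Q140"},    # cell "lion" -> lion
    cta={0: "Q16521"},       # column 0 holds taxa
    cpa={(0, 1): "P225"},    # column 1 gives the taxon name of column 0
    ra={0: "Q140"},          # the row as a whole describes the lion
)


def cell_match_rate(pred: Dict[Tuple[int, int], str],
                    truth: Dict[Tuple[int, int], str]) -> float:
    """Fraction of ground-truth cells whose predicted entity matches exactly."""
    if not truth:
        return 0.0
    correct = sum(1 for cell, entity in truth.items() if pred.get(cell) == entity)
    return correct / len(truth)
```

Keying CEA by `(row, col)` and CPA by a column pair mirrors how the task definitions above name their targets (cells, columns, column pairs, rows, and the whole table); a real loader would need to follow whatever format the repository's gt files actually use.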
Created:
2023-12-07