T2Dv2
Source: arXiv. Indexed: 2025-09-30
Download link:
http://webdatacommons.org/webtables/goldstandardV2.html
Description:
T2Dv2 is a dataset of common web tables collected from the Web. Every primary entity column is carefully annotated with DBpedia classes, and the annotations are extended to non-primary entity columns. The annotations include both "best" and "acceptable" classes. The dataset contains 237 primary entity columns and 174 non-primary entity columns. Its task is column type prediction, also called column type annotation.
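As a rough illustration of how the "best" and "acceptable" annotations might be used when scoring column type predictions, here is a minimal Python sketch. The 1.0 / 0.5 / 0.0 weighting, the function names, the `(table_id, column_index)` keys, and the class names are all assumptions made for this example; they are not defined by the dataset or taken from its files.

```python
from typing import Dict, Set, Tuple

ColumnId = Tuple[str, int]            # hypothetical key: (table id, column index)
Gold = Tuple[Set[str], Set[str]]      # (best classes, acceptable classes)


def score_column(predicted: str, best: Set[str], acceptable: Set[str]) -> float:
    """Score one prediction: 1.0 for a "best" class, 0.5 for an
    "acceptable" class, 0.0 otherwise (weights are an assumption)."""
    if predicted in best:
        return 1.0
    if predicted in acceptable:
        return 0.5
    return 0.0


def mean_score(predictions: Dict[ColumnId, str], gold: Dict[ColumnId, Gold]) -> float:
    """Average the per-column scores over all gold columns.
    A column with no prediction counts as 0.0."""
    if not gold:
        return 0.0
    total = 0.0
    for column_id, (best, acceptable) in gold.items():
        predicted = predictions.get(column_id)
        if predicted is not None:
            total += score_column(predicted, best, acceptable)
    return total / len(gold)


# Made-up annotations and predictions, for illustration only.
gold_example: Dict[ColumnId, Gold] = {
    ("table_a", 0): ({"City"}, {"Settlement", "Place"}),
    ("table_a", 1): ({"Country"}, set()),
}
predictions_example: Dict[ColumnId, str] = {
    ("table_a", 0): "Place",      # matches an "acceptable" class
    ("table_a", 1): "Country",    # matches the "best" class
}
print(mean_score(predictions_example, gold_example))
```

Giving partial credit to "acceptable" classes is only one possible design choice; a stricter evaluation could count only "best" matches as correct.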
Provider:
WebDataCommons



