ReVerb45K
Source: arXiv; indexed 2025-09-30
Download link:
https://github.com/malllabiisc/cesi/tree/master/data/reverb45k
Description:
ReVerb45K contains 45,000 triples extracted by ReVerb from ClueWeb09 source text, with every noun phrase (NP) annotated with its corresponding Freebase entity. The triples associated with 20% of the selected Freebase entities form the validation set; the remaining triples are used for testing. It is a large-scale dataset intended for open knowledge base (OKB) canonicalization and linking.
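The entity-level split described above can be illustrated with a minimal Python sketch. It is not the dataset authors' procedure: the field names (`subj_entity` and so on), the entity IDs `E1` to `E5`, the toy triples, and the choice to group triples by the subject NP's entity are all assumptions made for illustration, and the description does not specify the actual file format or sampling method.

```python
import random


def split_by_entity(triples, valid_fraction=0.2, seed=0):
    """Send all triples of a sampled subset of entities to validation.

    Assumption: each triple is grouped by the entity linked to its
    subject NP (hypothetical field "subj_entity").
    """
    entities = sorted({t["subj_entity"] for t in triples})
    rng = random.Random(seed)
    n_valid = round(len(entities) * valid_fraction)
    valid_entities = set(rng.sample(entities, n_valid))

    valid = [t for t in triples if t["subj_entity"] in valid_entities]
    test = [t for t in triples if t["subj_entity"] not in valid_entities]
    return valid, test, valid_entities


# Toy data: 5 hypothetical entities, each with 2 surface-form variants.
triples = [
    {
        "subj": f"noun phrase {i}{variant}",
        "rel": "is located in",
        "obj": "some place",
        "subj_entity": f"E{i}",
    }
    for i in range(1, 6)
    for variant in ("a", "b")
]

valid, test, valid_entities = split_by_entity(triples)
print(len(valid), len(test), sorted(valid_entities))
```

Splitting by entity rather than by triple keeps every surface form of a given entity on the same side of the split, so the validation set shares no gold entity with the test set.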



