WordNet Hypernym-Hyponym Pairs
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/GreenParachute/wordnet-randomwalk-python
Description:
This dataset is derived from WordNet and contains all of its hypernym-hyponym pairs, allowing researchers to study the taxonomic relationships between lexical items. The task is to predict which word in a given pair is the hypernym and which is the hyponym. To keep the two classes balanced, each instance is duplicated with the words swapped and the label flipped accordingly. A held-out test set of 25,000 instances is defined for evaluating classification probes. The dataset totals 493,494 instances: 443,494 for training and 50,000 for testing. The 50,000 figure is consistent with each of the 25,000 held-out instances appearing twice after balancing, although the description does not state this explicitly.
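The balancing step described above can be sketched in Python as follows. This is a minimal illustration, not the dataset authors' code: the function names `balance_pairs` and `split_held_out`, the label encoding (1 when the second word is the hypernym, 0 when the first is), the toy word pairs, and the choice to split before duplicating are all assumptions made for the example.

```python
import random


def split_held_out(pairs, n_test, seed=0):
    """Shuffle (hyponym, hypernym) pairs and set aside n_test of them.

    Assumption: splitting happens before balancing, so that a pair and
    its swapped copy never end up on opposite sides of the split.
    """
    rng = random.Random(seed)
    shuffled = list(pairs)
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


def balance_pairs(pairs):
    """Duplicate every pair with the words swapped and the label flipped.

    Hypothetical label encoding: 1 means the second word is the hypernym,
    0 means the first word is the hypernym.
    """
    examples = []
    for hyponym, hypernym in pairs:
        examples.append((hyponym, hypernym, 1))
        examples.append((hypernym, hyponym, 0))
    return examples


# Toy pairs for illustration only; the real pairs come from WordNet.
toy_pairs = [
    ("dog", "canine"),
    ("canine", "carnivore"),
    ("oak", "tree"),
    ("tree", "plant"),
]

train_pairs, test_pairs = split_held_out(toy_pairs, n_test=1)
train = balance_pairs(train_pairs)
test = balance_pairs(test_pairs)
```

Under this scheme every split contains exactly as many instances labelled 1 as labelled 0, and the instance count is twice the number of underlying pairs.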
Provider:
Derived from WordNet



