CoarseWSD-20
Source: arXiv. Indexed: 2025-09-30
Download link:
https://github.com/danlou/bert-disambiguation
Resource overview:
CoarseWSD-20 is a word sense disambiguation dataset covering a fixed set of 20 polysemous words, with an average of 2.65 distinct senses per word. It is split into a training set of 23,370 sentences and a test set of 10,196 sentences.
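As a quick sanity check on the figures quoted above, the following minimal sketch derives the total size, the split ratio, and the implied number of senses. The constant and variable names are illustrative only and do not come from the dataset's repository; the per-word averages assume nothing about how sentences are actually distributed across words, which is likely uneven.

```python
# Figures quoted in the resource overview.
NUM_WORDS = 20
AVG_SENSES_PER_WORD = 2.65
TRAIN_SENTENCES = 23_370
TEST_SENTENCES = 10_196

# Total number of sentences across both splits.
total_sentences = TRAIN_SENTENCES + TEST_SENTENCES

# Fraction of all sentences that fall in the training split.
train_share = TRAIN_SENTENCES / total_sentences

# An average of 2.65 senses over 20 words implies about 53 senses overall.
implied_total_senses = round(NUM_WORDS * AVG_SENSES_PER_WORD)

# Mean training sentences per target word (a plain average, not the
# real per-word counts, which are not given in the overview).
avg_train_per_word = TRAIN_SENTENCES / NUM_WORDS

print(f"total sentences:      {total_sentences}")
print(f"train share:          {train_share:.1%}")
print(f"implied total senses: {implied_total_senses}")
print(f"avg train per word:   {avg_train_per_word}")
```

The split works out to roughly 70% training and 30% test, and the average of 2.65 senses corresponds to about 53 senses in total across the 20 words.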



