SWS
Source: arXiv; indexed 2025-09-30
Download link:
https://github.com/microsoft/smartwordsuggestions
Description:
The dataset is divided into three subsets: a validation set and a test set, both labeled by human annotators, and an artificially generated training set. The training set consists of sentences from Wikipedia, with labels obtained by combining PPDB and the Merriam-Webster Thesaurus. The human annotations are intended to evaluate sentence quality through word substitution, and the task itself is referred to as word substitution.



