NEG-1500SIMP, ROLE-1500, NEG-1500SIMP-TEMP, NEG-1500SIMP-GEN
Source: arXiv. Updated 2023-11-15. Indexed 2024-06-21.
Download link:
https://github.com/text-machinelab/extending_psycholinguistic_dataset
Description:
This study extends two psycholinguistic datasets: the negation dataset NEG-1500SIMP and the role-reversal dataset ROLE-1500, each containing 750 sentence pairs. It also introduces two further negation datasets, the template-based NEG-1500SIMP-TEMP and the generated NEG-1500SIMP-GEN, with 770 and 750 sentence pairs respectively. The datasets were extended using GPT3 to increase the statistical power of the existing small-scale datasets. They are intended for evaluating language models, in particular their performance on complex linguistic phenomena such as negation and role reversal.
Provided by:
Department of Computer Science, University of Massachusetts Lowell
Created:
2023-03-29



