Two Word Test (TWT)
Source: arXiv · Updated: 2023-06-08 · Indexed: 2024-06-21
Download link:
https://github.com/NickRiccardi/two-word-test
Overview:
Two Word Test (TWT) is an open-source dataset created by the Department of Psychology at the University of South Carolina to evaluate the semantic understanding of large language models (LLMs). It contains 1,768 noun-noun combinations, each rated as meaningful or meaningless by 150 human participants. TWT tests an LLM's semantic judgment with simple two-word combinations and does not depend on complex tasks such as logical reasoning, planning, or puzzle solving. The dataset was built by filtering and generating two-word combinations from a large body of text, then using human ratings to determine whether each combination is meaningful. Its main use is evaluating and improving the semantic understanding of LLMs, particularly their ability to distinguish meaningful from meaningless word combinations.
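The evaluation described above can be sketched as scoring a model's meaningful/meaningless judgments against the human labels. This is a minimal illustration only: the `Item` structure, the `accuracy` helper, the `toy_judge` placeholder, and the sample phrases with their labels are all assumptions made for this sketch. They are not read from the dataset files and do not reflect the repository's actual file format or the authors' scoring procedure.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Item:
    phrase: str             # a noun-noun combination
    human_meaningful: bool  # assumed: one binary human label per phrase


def accuracy(items: Sequence[Item], judge: Callable[[str], bool]) -> float:
    """Return the fraction of items where judge(phrase) matches the human label."""
    if not items:
        raise ValueError("no items to score")
    hits = sum(judge(item.phrase) == item.human_meaningful for item in items)
    return hits / len(items)


# Illustrative items; phrases and labels are assumed for this sketch,
# not loaded from the TWT files.
SAMPLE = [
    Item("baby boy", True),
    Item("goat sky", False),
    Item("dog sled", True),
    Item("knife army", False),
]


def toy_judge(phrase: str) -> bool:
    # Hypothetical stand-in for an LLM call that answers
    # "meaningful" (True) or "meaningless" (False).
    return phrase in {"baby boy", "dog sled", "knife army"}


score = accuracy(SAMPLE, toy_judge)
print(f"agreement with human labels: {score:.2f}")
```

In practice `toy_judge` would be replaced by a function that prompts a model and parses its answer, and `SAMPLE` by items loaded from the repository linked above.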
Provider:
University of South Carolina
Created:
2023-06-08



