LI-Adger
Source: arXiv. Indexed 2025-09-30.
Download link:
https://www.jonsprouse.com/data/Lingua2013/SSA.data.zip
Resource overview:
LI-Adger is a collection of 519 sentence types: 300 drawn from the journal *Linguistic Inquiry* and 219 from Adger's textbook *Core Syntax*. The sentences are hand-constructed and semantically plausible, and are designed to probe the foundations of grammatical structure. Compared with earlier datasets, LI-Adger controls for semantic implausibility and covers a broader range of syntactic phenomena. It also includes human acceptability judgments collected with magnitude estimation tasks and Likert scales. In total, the dataset contains 519 sentence types and 2391 unique minimal pairs. It is intended for evaluating the syntactic abilities of language models in the context of human language acquisition.
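As an illustration of how minimal pairs are typically used to evaluate a model, the following is a minimal sketch, not the dataset authors' evaluation code. It assumes each pair is given as (acceptable sentence, unacceptable sentence) and that some scoring function returns a higher value for sentences the model prefers. The names `minimal_pair_accuracy`, `toy_pairs`, and `toy_score` are hypothetical, the example sentences are made up rather than taken from LI-Adger, and the layout of the downloadable archive is not assumed here.

```python
from typing import Callable, Iterable, Tuple

# Assumed representation of one minimal pair: (acceptable, unacceptable).
MinimalPair = Tuple[str, str]


def minimal_pair_accuracy(
    pairs: Iterable[MinimalPair],
    score: Callable[[str], float],
) -> float:
    """Return the fraction of pairs where the acceptable sentence scores higher."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no minimal pairs given")
    correct = sum(1 for good, bad in pairs if score(good) > score(bad))
    return correct / len(pairs)


# Made-up pairs for illustration only; they are not from LI-Adger.
toy_pairs = [
    ("The cat sleeps.", "The cat sleep."),
    ("She has eaten.", "She have eaten."),
]


def toy_score(sentence: str) -> float:
    # Stand-in for a language-model score (e.g. a log-probability):
    # this toy version simply prefers shorter sentences.
    return -float(len(sentence))


toy_accuracy = minimal_pair_accuracy(toy_pairs, toy_score)
print(toy_accuracy)
```

In practice `toy_score` would be replaced by a model-derived score, and the resulting preferences could then be compared against the human judgments that accompany the dataset.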



