
Grammatical Features Inventory: Stem Index

Source: DataCite Commons. Updated: 2020-08-03. Indexed: 2025-04-17.
Download link:
http://www.smg.surrey.ac.uk/features/cite
Resource description:
In attempting to understand language, many researchers use features, the elements into which linguistic units, such as words, can be broken down. Examples of features are NUMBER (singular, plural, dual, ...), PERSON (1st, 2nd, 3rd), and TENSE (present, past, ...). Features have proved invaluable for analysis and description, and have a major role in contemporary linguistics, from the most abstract theorising to the most practical computational applications. Yet little is firmly established about features: we have no inventory of which features are found in the world's languages, no agreed account of how they operate across different components of language, no certainty on how they interact, and thus no general theory of features. They are used, but are little discussed and poorly understood. This is a central gap in the conceptual underpinning of much linguistic investigation.

The Grammatical Features Inventory is an attempt to put the notion of linguistic 'feature' on a sounder empirical and conceptual base. It aims to provide evidence for the diverse content of features in the world's languages, and to discuss some of their formal properties, particularly in morphology (word structure) and syntax (sentence structure).

Stem indexing features pick out the stems relevant for particular parts of a paradigm. While stems may be phonologically closer to or more distant from one another, we can generalise over them irrespective of their phonological similarity. Stem indexing features are to be set apart from inflectional class features, since stem alternations may generalise across the different inflectional classes specified by inflectional class feature values. For discussion and references to further literature, see Corbett and Baerman (2006).

This resource was created for the project 'Grammatical features: A key to understanding language', funded by the Economic and Social Research Council under grant number RES-051-27-0122. This support is gratefully acknowledged.

Provider:
University of Surrey
Created:
2015-07-20