
Background Data for: The complexity principle and the morphosyntactic alternation between case affixes and postpositions in Estonian

DataCite Commons | Updated: 2026-01-05 | Indexed: 2026-04-25
Download link:
https://dataverse.no/citation?persistentId=doi:10.18710/KDSZEP
Description:
Manually annotated dataset of 3,000 uses of exterior locative constructions (specifically cases and postpositions) in present-day Estonian. The data are extracted from the Estonian National Corpus (ENC 2017; 1.1 billion words, mainly web-based texts) and include 500 uses of each of the following six constructions: allative, adessive, ablative, peale, peal, pealt. The sampling procedure and further details about the dataset are given in Klavan & Schützler (to appear in Cognitive Linguistics). The data are annotated for 9 variables: postpos (outcome variable: case, postposition), position (post, pre), complexity (simple, compound), length (length of the landmark phrase in syllables), frequency (raw frequency of the landmark form in association with the respective semantic relation), function (adverbial, modifier), verb_lemma (224 levels for lative, 279 levels for locative, 252 levels for separative), lm_lemma (592 levels for lative, 438 levels for locative, 528 levels for separative), and sem_rel (lative, locative, separative). The dataset was collected by the PI of the project PUT1358 "The Making and Breaking of Models: Experimentally Validating Classification Models in Linguistics" (1.01.2017−31.12.2020), funded by the Estonian Research Council.
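The design described above (six constructions, 500 uses each, nine annotated variables) can be sketched as a small sanity check. This is only an illustrative sketch, not the authors' code: the pairing of each construction with a semantic relation is inferred from the order in which the description lists them, the assumption that length and frequency are stored as non-negative integers is mine, and `check_row` and `expected_counts` are hypothetical helper names.

```python
from collections import Counter

# Six constructions, 500 uses each, as stated in the description.
# ASSUMPTION: the (postpos, sem_rel) pairing below is inferred from the
# order of the list in the description; the dataset page does not spell it out.
CONSTRUCTIONS = {
    "allative": ("case", "lative"),
    "adessive": ("case", "locative"),
    "ablative": ("case", "separative"),
    "peale": ("postposition", "lative"),
    "peal": ("postposition", "locative"),
    "pealt": ("postposition", "separative"),
}
USES_PER_CONSTRUCTION = 500

# Levels of the categorical variables, copied from the description.
LEVELS = {
    "postpos": {"case", "postposition"},
    "position": {"post", "pre"},
    "complexity": {"simple", "compound"},
    "function": {"adverbial", "modifier"},
    "sem_rel": {"lative", "locative", "separative"},
}

# ASSUMPTION: these two variables are stored as non-negative integers.
NUMERIC_COLUMNS = ("length", "frequency")


def expected_counts():
    """Expected number of rows per (postpos, sem_rel) cell of the design."""
    counts = Counter()
    for postpos, sem_rel in CONSTRUCTIONS.values():
        counts[(postpos, sem_rel)] += USES_PER_CONSTRUCTION
    return counts


def check_row(row):
    """Return a list of problems found in one row (a dict of strings)."""
    problems = []
    for column, allowed in LEVELS.items():
        value = row.get(column)
        if value not in allowed:
            problems.append(f"{column}: unexpected value {value!r}")
    for column in NUMERIC_COLUMNS:
        value = str(row.get(column, ""))
        if not value.isdigit():
            problems.append(f"{column}: expected a non-negative integer, got {value!r}")
    return problems
```

If the downloaded file is a delimited table with one column per variable (an assumption about its layout), each row read with `csv.DictReader` could be passed to `check_row`, and the observed tally of `(postpos, sem_rel)` pairs compared against `expected_counts()`.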
Provider:
DataverseNO
Created:
2022-06-14