
Background Data for: The complexity principle and the morphosyntactic alternation between case affixes and postpositions in Estonian

Source: DataONE. Updated 2026-01-05; indexed 2026-01-17.
Download link:
https://search.dataone.org/view/sha256:52bbcb1f5f7b68df434fd3f960fac9ab182bcd313370e589f9f7201aab925bb3
Description:
Manually annotated dataset of 3,000 uses of exterior locative constructions (specifically cases and postpositions) in present-day Estonian. The data is extracted from the Estonian National Corpus (ENC 2017; 1.1 billion words, mainly web-based texts) and includes 500 uses of each of the following constructions: allative, adessive, ablative, peale, peal, pealt. The sampling procedure and further details about the dataset are given in Klavan & Schützler (to appear in Cognitive Linguistics). The data is annotated for 9 variables: postpos (outcome variable: case, postposition); position (post, pre); complexity (simple, compound); length (length of the landmark phrase in syllables); frequency (raw frequency of the landmark form in association with the respective semantic relation); function (adverbial, modifier); verb_lemma (224 levels for lative, 279 levels for locative, 252 levels for separative); lm_lemma (592 levels for lative, 438 levels for locative, 528 levels for separative); sem_rel (lative, locative, separative). The dataset was collected by the PI of the project PUT1358 "The Making and Breaking of Models: Experimentally Validating Classification Models in Linguistics" (1.01.2017–31.12.2020), funded by the Estonian Research Council.
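The sampling design and variable list above can be summarised as a small codebook. The following Python sketch is illustrative only: it reads no file, and the names CONSTRUCTIONS, VARIABLES and expected_counts are hypothetical. The pairing of each construction with a semantic relation (allative and peale as lative, adessive and peal as locative, ablative and pealt as separative) is an assumption based on standard Estonian grammar; the description does not state it explicitly.

```python
from collections import Counter

# Sample size per construction, as stated in the description.
USES_PER_CONSTRUCTION = 500

# construction -> (postpos value, sem_rel value).
# ASSUMPTION: the construction-to-relation pairing follows standard
# Estonian grammar; the dataset description does not spell it out.
CONSTRUCTIONS = {
    "allative": ("case", "lative"),
    "adessive": ("case", "locative"),
    "ablative": ("case", "separative"),
    "peale": ("postposition", "lative"),
    "peal": ("postposition", "locative"),
    "pealt": ("postposition", "separative"),
}

# The 9 annotated variables: listed levels, level counts, or a short note.
VARIABLES = {
    "postpos": ("case", "postposition"),
    "position": ("post", "pre"),
    "complexity": ("simple", "compound"),
    "length": "length of the landmark phrase in syllables",
    "frequency": "raw frequency of the landmark form for the semantic relation",
    "function": ("adverbial", "modifier"),
    "verb_lemma": {"lative": 224, "locative": 279, "separative": 252},
    "lm_lemma": {"lative": 592, "locative": 438, "separative": 528},
    "sem_rel": ("lative", "locative", "separative"),
}


def expected_counts(index):
    """Expected number of rows per level; index 0 = postpos, 1 = sem_rel."""
    counts = Counter()
    for labels in CONSTRUCTIONS.values():
        counts[labels[index]] += USES_PER_CONSTRUCTION
    return dict(counts)


total = USES_PER_CONSTRUCTION * len(CONSTRUCTIONS)
print(total)               # 3000
print(expected_counts(0))  # {'case': 1500, 'postposition': 1500}
print(expected_counts(1))  # {'lative': 1000, 'locative': 1000, 'separative': 1000}
```

Under these assumptions the design is balanced: 1,500 rows per outcome level and 1,000 rows per semantic relation. Counts like these could serve as a sanity check after downloading the actual file, whose format and column layout are not specified here.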
Created:
2026-01-06