Background data (adapted from Jenset & McGillivray 2017) for: Down-sampling from hierarchically structured corpus data

Name: Background data (adapted from Jenset & McGillivray 2017) for: Down-sampling from hierarchically structured corpus data
Creator: University of Bamberg
Published: 2025-07-17
License: No description available

Source: DataverseNO. Updated: 2025-07-17. Indexed: 2026-04-13.

Download link:

https://dataverse.no/citation?persistentId=doi:10.18710/5KCE4U

Resource description:

Dataset description: This dataset, adapted from Jenset and McGillivray (2017), contains tabular files documenting the alternation between -(e)th and -(e)s as markers of third-person singular verb inflection in Early Modern English. The data provided by Jenset and McGillivray (2017) are drawn from the PPCEME corpus (Kroch et al. 2004) and cover the period from 1500 to 1700. In total, these authors annotated 13,757 third-person singular tokens (excluding the verb BE) for a range of variables. For the purposes of the present methodological study, the dataset was reduced to a subset of 11,645 tokens, and the coding of some variables was revised, completed, or modified. The dataset records the Author and Verb Lemma of each token, along with a number of predictor variables: Genre, Year, Frequency (of the verb lemma in the third-person singular), Phonological Context (stem-final sound), and the Gender of the author.
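To illustrate what down-sampling from a table grouped by author can look like, here is a minimal Python sketch. The column names (`author`, `variant`), the toy rows, and the helper `downsample_per_author` are assumptions made for illustration only; they are not taken from the dataset files, and capping the number of tokens per author is just one possible strategy, not necessarily the one used in the associated study.

```python
import random
from collections import Counter, defaultdict


def downsample_per_author(rows, max_per_author, seed=0):
    """Keep at most `max_per_author` randomly chosen tokens per author.

    Hypothetical helper: `rows` is a list of dicts with an "author" key.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible

    # Group tokens by the author who produced them.
    by_author = defaultdict(list)
    for row in rows:
        by_author[row["author"]].append(row)

    # Draw without replacement within each author's group.
    sample = []
    for author in sorted(by_author):
        tokens = by_author[author]
        k = min(max_per_author, len(tokens))
        sample.extend(rng.sample(tokens, k))
    return sample


# Invented toy rows (NOT real data): three authors with unequal token counts.
toy_rows = (
    [{"author": "A", "variant": "th"}] * 5
    + [{"author": "B", "variant": "s"}] * 2
    + [{"author": "C", "variant": "s"}] * 1
)

subset = downsample_per_author(toy_rows, max_per_author=2)
counts = Counter(row["author"] for row in subset)
print(dict(counts))
```

Grouping before sampling keeps every author represented in the reduced set, whereas a simple random draw over all tokens would tend to favour the most prolific authors.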

Provider:

University of Bamberg

Created:

2023-01-01
