Background data (adapted from Jenset & McGillivray 2017) for: Down-sampling from hierarchically structured corpus data
DataverseNO | Updated 2025-07-17 | Indexed 2026-04-13
Download link:
https://dataverse.no/citation?persistentId=doi:10.18710/5KCE4U
Resource description:
<p><strong>Dataset description</strong></p>
<p>This dataset, adapted from Jenset and McGillivray (2017), contains tabular files documenting the alternation between -(e)th and -(e)s as markers of third-person singular verb inflection in Early Modern English. The data provided by Jenset and McGillivray (2017) are drawn from the PPCEME corpus (Kroch et al. 2004) and cover the period from 1500 to 1700. In total, these authors annotated 13,757 third-person singular tokens (excluding the verb BE) for a range of variables. For the purposes of the present methodological study, the dataset was reduced to a subset of 11,645 tokens, and the coding of some variables was revised, completed, or modified. The dataset includes information about the Author and Verb Lemma, as well as a number of predictor variables: Genre, Year, Frequency (of the verb lemma in the third-person singular), Phonological Context (stem-final sound), and the Gender of the author.</p>
Provider:
University of Bamberg
Created:
2023-01-01



