Code-switching in Native Authored Children’s Books

NIAID Data Ecosystem, indexed 2026-05-01

Download link:

https://data.mendeley.com/datasets/dhzvxn7rgy

Description:

This study examined the frequency of code-switching (CS), transferring individual lexical items from one language into the speech of another language, within Native American authored children's books, then compared that frequency with the frequency in Latino authored children's books and in children's books by authors who are neither Native American nor Latino. The corpora comprise 13 Native American authored children's books, 7 Latino authored Pura Belpré Award winning children's books, and 6 Caldecott Award winning books by non-Native American, non-Latino authors (the reference corpus). The data were analyzed with AntConc using the following procedure.

Stage 1: N-gram identification

A list of trigrams was generated for each corpus individually using the n-gram tool in AntConc. The n-gram tool identifies clusters of n consecutive word tokens, in this case three. For example, the trigrams for “Yayah wore a scarf, too” are “Yayah wore a,” “wore a scarf,” and “a scarf, too.” Trigrams consisted solely of word tokens; other tokens, such as punctuation, were excluded from the trigram count.

Stage 2: Designation of the N-gram as a Code-switch

For the Native American and Latino corpora, a trigram was designated a code-switch if it included one English word; for the reference corpus, the trigram had to include one non-English word. The trigram lists for each corpus were reviewed to identify trigrams that met these inclusion criteria. Trigrams were then reviewed with the concordance tool to eliminate overlapping solely intersentential trigrams (e.g., “like her abuela’s,” “her abuela’s psychic,” and “abuela’s psychic sometimes”) and trigrams with both intersentential and intrasentential overlap (e.g., “sang to Nibi.” and “to Nibi. They”). Trigrams whose only non-English word was a definite article (e.g., el or la) were also excluded (e.g., “went into la”), as these did not meet the definition of CS.
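The trigram generation in Stage 1 and the definite-article exclusion in Stage 2 can be sketched in Python. This is only an illustration under stated assumptions: the study used AntConc and manual review, not this code, and the `NON_ENGLISH` and `DEFINITE_ARTICLES` word lists, along with all function names, are hypothetical stand-ins for that review. The sketch does not attempt the overlap elimination, which the authors performed by inspecting concordance lines.

```python
import re

# Hypothetical word lists, for illustration only. In the study, non-English
# words were identified by manual review in AntConc, not from a fixed lexicon.
NON_ENGLISH = {"yayah", "nibi", "abuela's", "el", "la"}
DEFINITE_ARTICLES = {"el", "la"}


def word_tokens(text):
    """Return word tokens only; punctuation is dropped."""
    text = text.replace("\u2019", "'")  # normalize curly apostrophes
    # A token is a run of letters, optionally joined by internal apostrophes.
    return re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)*", text)


def trigrams(tokens):
    """Return every run of three consecutive tokens."""
    return [tuple(tokens[i:i + 3]) for i in range(len(tokens) - 2)]


def is_cs_candidate(trigram):
    """True if the trigram holds a listed non-English word other than an article."""
    foreign = {word.lower() for word in trigram} & NON_ENGLISH
    return bool(foreign - DEFINITE_ARTICLES)


# The example sentence from the description above.
for tri in trigrams(word_tokens("Yayah wore a scarf, too")):
    print(" ".join(tri), is_cs_candidate(tri))
```

Because punctuation is dropped before the trigrams are built, the comma in “a scarf, too” does not count as a token, which matches the trigram list given for that sentence.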
Stage 3: N-gram Coding

The trigrams remaining after Stages 1 and 2 were coded as follows: English-only words (dummy coded = 0), English and one Native American identified word (dummy coded = 1), and English and one Spanish identified word (dummy coded = 2).

Data Analysis

For RQ1, the frequency of Native American CS was determined by comparing the number of CS trigrams with the total number of trigrams in the corpus. For RQs 2–3, the same frequency calculation was repeated for the reference and Latino corpora. Differences in CS frequency between the Native American corpus and the reference and Latino corpora were assessed using the log-likelihood ratio test (G2). The effect size was the log ratio (LR), where a log ratio of 1 means the variable is twice as common in one corpus as in the comparison corpus (Hardie, 2014). The alpha level was .01. R was used for all analyses.
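The frequency comparison can be illustrated with a short Python sketch. It assumes the two-cell form of the log-likelihood statistic that is common in corpus linguistics; the description does not say which form or which R functions the authors used, so the function names and the counts below are hypothetical and are not results from the dataset.

```python
import math


def cs_rate(cs_trigrams, total_trigrams):
    """Proportion of trigrams in a corpus that are code-switches."""
    return cs_trigrams / total_trigrams


def log_likelihood(a, c, b, d):
    """G2 for a CS trigrams out of c versus b CS trigrams out of d.

    Assumption: two-cell form, comparing observed counts with the counts
    expected if both corpora shared one overall rate.
    """
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    total = 0.0
    for observed, expected in ((a, e1), (b, e2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            total += observed * math.log(observed / expected)
    return 2 * total


def log_ratio(a, c, b, d):
    """Binary log of the ratio of relative frequencies (Hardie, 2014)."""
    if a == 0 or b == 0:
        raise ValueError("log ratio is undefined for a zero count")
    return math.log2((a * d) / (b * c))


# Made-up counts, for illustration only.
g2 = log_likelihood(20, 1000, 10, 1000)
lr = log_ratio(20, 1000, 10, 1000)
# With 1 degree of freedom, G2 must exceed about 6.63 for p < .01.
print(round(g2, 2), lr, g2 > 6.63)
```

In this made-up case the rate in the first corpus is double the rate in the second, so the log ratio is 1, yet G2 stays below the 6.63 threshold: an effect size and a significance test answer different questions, which is why the study reports both.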

Created:

2024-01-15
