
Chinese Onomatopoeia Database (COD): Concreteness, imageability, age of acquisition, familiarity, semantic transparency, context availability, emotional valence, and emotional arousal for Chinese onomatopoeic words

DataCite Commons · Updated 2025-04-27 · Indexed 2025-05-18
Download link:
https://www.scidb.cn/detail?dataSetId=e63a6a64adc84c81abf4d772622c9e01
Resource description:
While numerous lexical databases provide rating norms for a wide range of words, resources for onomatopoeia remain scarce. Given the pivotal role of onomatopoeia in language development and the insights it may offer into the relationship between word phonology and word meaning, we introduce the Chinese Onomatopoeia Database (COD), comprising 97 one-character, 380 two-character, 91 three-character, and 183 four-character onomatopoeic words in Chinese (total N = 751). All words were rated by 311 native Chinese speakers for concreteness, imageability, age of acquisition (AoA), familiarity, semantic transparency, context availability, emotional valence, and emotional arousal. We demonstrated high reliability across these measures using Cronbach’s alpha, split-half coefficients, and intra-class correlation coefficients (ICCs). Correlation analyses revealed significant associations among these lexical variables, including those between semantic and affective variables. The predictive validity of these variables was also examined using normed reaction times (RTs) for two-character onomatopoeic words, obtained in a lexical decision experiment (N = 79); all COD variables significantly predicted lexical decision RTs, albeit with varying predictive power. Further analyses with two measures, Zipf and logCD, from the Chinese Children’s Lexicon of Written Words (CCLOWW) showed that these measures were significantly correlated with all COD variables. Including the Zipf measure in the regression models notably improved the prediction of lexical decision RTs. The establishment of the COD not only fills a crucial gap in psycholinguistic resources but also provides a robust tool for future research into the cognitive and developmental underpinnings of language processing.
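
As an illustration of one of the reliability measures named above, the following is a minimal sketch of Cronbach's alpha for a words-by-raters matrix. It assumes raters are treated as the "items" of the scale, which the description does not confirm; the function name `cronbach_alpha` and the toy ratings are hypothetical and are not taken from the COD.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a rectangular matrix of ratings.

    ratings: list of rows, one row per word, one column per rater.
    Assumption: raters play the role of "items" in the alpha formula.
    """
    n_raters = len(ratings[0])
    if n_raters < 2:
        raise ValueError("need at least two raters")
    # Sample variance of each rater's column across words.
    rater_vars = [variance([row[j] for row in ratings]) for j in range(n_raters)]
    # Sample variance of the per-word total scores.
    total_var = variance([sum(row) for row in ratings])
    return (n_raters / (n_raters - 1)) * (1 - sum(rater_vars) / total_var)


# Hypothetical toy data: 4 words rated by 3 raters on a 1-5 scale.
toy = [
    [5, 4, 5],
    [2, 2, 3],
    [4, 4, 4],
    [1, 2, 1],
]
alpha = cronbach_alpha(toy)
print(f"alpha = {alpha:.3f}")
```

Values close to 1 indicate that raters order the words consistently. Split-half coefficients and ICCs would need separate computations that this sketch does not cover.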

Provider:
Science Data Bank
Created:
2024-10-22
Dataset introduction
Background overview
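This dataset is the Chinese Onomatopoeia Database (COD). It contains 751 onomatopoeic words of one to four characters, rated by native speakers on eight psycholinguistic dimensions (e.g., concreteness and emotional valence). The ratings show high reliability and significantly predict lexical decision reaction times. The database fills a gap in onomatopoeia resources and provides a tool for research on language cognition and development.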
The summary above was collected and generated by 遇见数据集.