Relative uniqueness by quartile with Z tests.
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Relative_uniqueness_by_quartile_with_Z_tests_/26141282
Description:
This study investigates the phenomenon of semantic drift through the lenses of language and situated simulation (LASS) and the word frequency effect (WFE) within a timed word association task. Our primary objectives were to determine whether semantic drift can be identified over the short duration (25 seconds) of a free word association task (a predicted corollary of LASS), and whether more frequent terms are generated earlier in the process (as expected under the WFE). Respondents were given five cue words (tree, dog, quality, plastic and love) and asked to write as many associations as they could. We hypothesized that terms generated later in the task (fourth time quartile, seconds 19–25) would be semantically more distant from the cue word, as measured by cosine similarity, than those generated earlier (first time quartile, seconds 1–7), indicating semantic drift. Additionally, we explored the WFE by hypothesizing that earlier generated words would be more frequent and less diverse. Using a dataset matched with GloVe 300B word embeddings, BERT and WordNet synsets, we analysed semantic distances among 1569 unique term pairs for all cue words across time. Our results supported the presence of semantic drift, with significant evidence of within-participant semantic drift from the first to the fourth time (LASS) and frequency (WFE) quartiles. In terms of the WFE, we observed notably lower diversity among terms generated earlier in the task, while more unique terms (greater diversity and relative uniqueness) were generated in the fourth time quartile, consistent with our hypothesis that more frequently used words dominate the early stages of a word association task. We also found that effect sizes varied substantially across cues, suggesting that some cues might invoke stronger and more idiosyncratic situated simulations. Theoretically, our study contributes to the understanding of LASS and the WFE. 
It suggests that semantic drift might serve as a scalable indicator of the invocation of language versus simulation systems in LASS and might also be used to explore cognition within word association tasks more generally. The findings also add a temporal and relational dimension to the WFE. Practically, our research highlights the utility of word association tasks in understanding semantic drift and the diffusion of word usage over a sub-minute task, arguably the shortest practically feasible timeframe. This offers a scalable method to explore group and individual changes in semantic relationships, whether in tracking the targeted diffusion of influence in a marketing campaign or in seeking to understand differences in cognition more generally. Possible practical uses and opportunities for future research are discussed.
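The two measures named in this record, cosine similarity between a cue and a response and a Z test comparing quartiles, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy vectors, the counts, and the choice of a pooled two-proportion Z test for comparing the share of unique terms between two quartiles are all assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion Z test (assumed test, not confirmed by the source).

    x1 of n1 terms are unique in one quartile, x2 of n2 in the other.
    Returns the Z statistic and the two-sided p-value.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical toy data, not taken from the dataset.
cue = [0.2, 0.7, 0.1]
early_response = [0.25, 0.65, 0.05]   # e.g. a term from the first time quartile
late_response = [0.9, 0.1, 0.4]       # e.g. a term from the fourth time quartile

early_sim = cosine_similarity(cue, early_response)
late_sim = cosine_similarity(cue, late_response)

# Hypothetical counts of unique terms: 30 of 100 early versus 50 of 100 late.
z, p = two_proportion_z(30, 100, 50, 100)
```

Under the drift hypothesis described above, a late response would tend to have lower cosine similarity to the cue than an early one, and a negative Z here would indicate a smaller share of unique terms in the earlier quartile.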
Created:
2024-07-01



