Unifying Amplitude and Phase Analysis: A Compositional Data Approach to Functional Multivariate Mixed-Effects Modeling of Mandarin Chinese
Source: DataCite Commons. Updated: 2020-09-04. Indexed: 2024-07-25
Download link:
https://tandf.figshare.com/articles/dataset/Unifying_Amplitude_and_Phase_Analysis_A_Compositional_Data_Approach_to_Functional_Multivariate_Mixed_Effects_Modeling_of_Mandarin_Chinese/1473749/2
Description:
Mandarin Chinese is characterized by being a tonal language; the pitch (or F₀) of its utterances carries considerable linguistic information. However, speech samples from different individuals are subject to changes in amplitude and phase, which must be accounted for in any analysis that attempts to provide a linguistically meaningful description of the language. A joint model for amplitude, phase, and duration is presented, which combines elements from functional data analysis, compositional data analysis, and linear mixed effects models. By decomposing functions via a functional principal component analysis, and connecting registration functions to compositional data analysis, a joint multivariate mixed effect model can be formulated, which gives insights into the relationship between the different modes of variation as well as their dependence on linguistic and nonlinguistic covariates. The model is applied to the COSPRO-1 dataset, a comprehensive database of spoken Taiwanese Mandarin, containing approximately 50,000 phonetically diverse sample F₀ contours (syllables), and reveals that phonetic information is jointly carried by both amplitude and phase variation. Supplementary materials for this article are available online.
Provider:
Taylor & Francis
Created:
2016-01-20



