Structure in talker variability: How much is there and how much can it help?
osf.io | Updated 2018-07-10 | Indexed 2025-03-26
Download link:
https://osf.io/3wcy5
Resource description:
One of the persistent puzzles in understanding human speech perception is how listeners cope with talker variability. One thing that might help listeners is structure in talker variability: rather than varying randomly, talkers of the same gender, dialect, age, etc. tend to produce language in similar ways. Sociolinguistic research has shown that listeners are sensitive to this covariation between linguistic variation and socio-indexical variables. In this paper I present new techniques based on ideal observer models to quantify 1) the amount and type of structure in talker variation, and 2) how useful such structure can be for robust speech recognition in the face of talker variability. I demonstrate these techniques in two phonetic domains---word-initial stop voicing and vowel identity---and show that these domains have different amounts and types of talker variability, consistent with previous, impressionistic findings. An `R` package accompanies this paper, enabling researchers to apply these techniques to their own data.
Provider:
osf.io



