Articulation Index

Name: Articulation Index
Creator: Linguistic Data Consortium
Published: 2021-07-01 16:17:22
License: No description available

DataCite Commons · Updated 2021-07-01 · Indexed 2025-04-16

Download link:

https://catalog.ldc.upenn.edu/LDC2005S22


Resource description:

<h3>Introduction</h3><br> <p>Articulation Index was developed by the Linguistic Data Consortium (LDC) and was partly inspired by the work of Harvey Fletcher, who performed a number of perceptual experiments involving English syllables during the first half of the 20th century. His term <em>articulation index</em> meant something like a perceptual index of syllables, where those syllables were not necessarily words, and reflected how well listeners could correctly identify syllables in the presence of noise. This corpus was created to facilitate similar experiments, as well as to potentially facilitate new methods in speech recognition research.</p><br> <p>The basic concept behind the corpus was to record speakers pronouncing syllables of English, some of which might be real words, but most of which are nonsense syllables. The goal was to have each speaker say a set of 2,000 syllables common to all speakers, as well as a set of 20 syllables unique to that speaker.</p><br> <p>LDC has also released Articulation Index LSCP (<a href="../../../LDC2015S12">LDC2015S12</a>).</p><br> <h3>Data</h3><br> <p>This release contains recordings of 20 American English speakers (12 males, 8 females) saying 2005 common syllables, 1845 of which are common to all speakers, and 400 unique syllables (20 syllables per speaker).</p><br> <p>The recordings were made in a small, sound-treated anechoic room at LDC. The speakers wore two microphones: a Sennheiser 410 headset and a Nortel Liberator wireless phone headset. The Sennheiser's signal traveled through a Symetrix 302 Dual Microphone Preamp, a Sony PCM-R300 DAT deck and a Townshend Datlink to a Sun SPARCserver 20, where it was written to disk as 16 kHz, 16-bit PCM data. 
The Nortel's signal was transmitted to a wireless base station at a telephone connected via the network to LDC's telephone recording platform, where it was captured to disk as 8 kHz, 8-bit u-law data.</p><br> <p>The speakers were prompted via a computer interface that displayed one prompt at a time, allowing them to iterate through the prompts by pressing a "next" button. Each recording session lasted approximately 15 minutes.</p><br> <h3>Samples</h3><br> <p>For an example of this corpus, please review this <a href="desc/addenda/LDC2005S22.wav" rel="nofollow">audio sample</a>.</p><br> © 2005 Trustees of the University of Pennsylvania
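The description above mentions two capture formats: 16 kHz, 16-bit PCM for the headset channel and 8 kHz, 8-bit u-law for the telephone channel. To compare the two channels, the u-law bytes have to be expanded to linear samples first. The following is a minimal sketch of the standard G.711 u-law expansion in pure Python. It assumes the telephone-channel audio is available as raw, headerless u-law bytes; the actual file container used in the release is not described here, and the function names are illustrative only.

```python
def ulaw_to_linear(byte: int) -> int:
    """Expand one 8-bit G.711 u-law code to a signed linear PCM sample."""
    u = ~byte & 0xFF                 # u-law codes are stored bit-inverted
    sign = u & 0x80                  # top bit: sign
    exponent = (u >> 4) & 0x07       # next three bits: segment number
    mantissa = u & 0x0F              # low four bits: step within segment
    # 0x84 (132) is the bias added by the encoder; remove it after scaling.
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def decode_ulaw(data: bytes) -> list[int]:
    """Decode a raw u-law byte string into a list of linear samples."""
    return [ulaw_to_linear(b) for b in data]


def duration_seconds(data: bytes, sample_rate: int = 8000) -> float:
    """Duration of raw u-law audio: one byte per sample (8 kHz assumed)."""
    return len(data) / sample_rate


if __name__ == "__main__":
    raw = bytes([0xFF, 0x80, 0x00])  # silence, positive peak, negative peak
    print(decode_ulaw(raw))
    print(duration_seconds(bytes(8000)))
```

Decoded samples span roughly 14 bits of dynamic range inside a 16-bit container, so the telephone channel can be resampled or compared against the 16-bit PCM channel after this step.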

Provider:

Linguistic Data Consortium

Created:

2020-11-30
