CSLU: Portland Cellular Telephone Speech Version 1.3
Source: DataCite Commons. Updated: 2021-07-01. Indexed: 2025-04-16.
Download link:
https://catalog.ldc.upenn.edu/LDC2008S01
Description:
<h3>Introduction</h3>
<p>CSLU: Portland Cellular Telephone Speech Version 1.3 was created by the Center for Spoken Language Understanding (CSLU) at the OGI School of Science and Engineering, Oregon Health and Science University, Beaverton, Oregon. It consists of cellular telephone speech and corresponding transcripts: 7,571 utterances from 515 speakers who made calls in the Portland, Oregon area using cellular telephones.</p>
<p>Speakers called the CSLU data collection system on cellular telephones and were asked to repeat certain phrases and to respond to other prompts. Two prompt protocols were used: an In Vehicle Protocol for speakers calling from inside a vehicle and a Not in Vehicle Protocol for those calling from outside a vehicle. The protocols shared several questions, but each also contained distinct queries designed to probe the caller's in-vehicle or not-in-vehicle surroundings. Not every caller responded to every prompt.</p>
<h3>Recording Details</h3>
<p>The speech data was captured digitally from CSLU's T1 connection and saved as 8 kHz, 16-bit linear.</p>
<h3>Transcriptions</h3>
<p>The text transcriptions in this corpus were produced using the non-time-aligned word-level conventions described in The CSLU Labeling Guide, which is included in the documentation for this release. CSLU: Portland Cellular Telephone Speech Version 1.3 contains orthographic and phonetic transcriptions of the corresponding speech files. Non-time-aligned orthographic transcriptions provide quick access to the content of an utterance; they may contain markers for word boundaries to support access and retrieval at the lexical level. Phonetic/phonemic transcriptions represent the phonetic content of an utterance at a given level of detail that is made explicit by the use of diacritics. The phonetic phenomena transcribed include excessive nasalization, glottalization, frication on a stop, centralization, lateralization, rounding and palatalization.</p>
<h3>Samples</h3>
<p>For an example of the data in this corpus, please examine the following audio file and transcript.</p>
<ul> <li><a href="./desc/addenda/LDC2008S01.wav" rel="nofollow">audio(wav)</a></li> <li><a href="./desc/addenda/LDC2008S01.txt" rel="nofollow">transcript</a></li> </ul>
<br>
Portions © 1995, 1998, 2000, 2002 Center for Spoken Language Understanding, Oregon Health & Science University, © 2008 Trustees of the University of Pennsylvania
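The recording format stated under Recording Details (8 kHz, 16-bit linear) can be checked on a downloaded sample with a few lines of Python. The following is a minimal sketch, not part of the corpus documentation. It assumes the sample is a standard RIFF PCM WAV file that Python's built-in `wave` module can read. The function names, the mono channel setting, and the synthetic test file are illustrative assumptions.

```python
import wave

EXPECTED_RATE_HZ = 8000     # "8 kHz" from the Recording Details
EXPECTED_SAMPLE_WIDTH = 2   # 16-bit linear = 2 bytes per sample


def describe_wav(path):
    """Return basic format facts for a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        frames = wav.getnframes()
        return {
            "sample_rate_hz": rate,
            "sample_width_bytes": width,
            "channels": wav.getnchannels(),
            "duration_s": frames / rate if rate else 0.0,
            # True only when the file agrees with the stated recording format
            "matches_description": rate == EXPECTED_RATE_HZ
            and width == EXPECTED_SAMPLE_WIDTH,
        }


def write_silence(path, seconds=1.0, rate=EXPECTED_RATE_HZ):
    """Write a synthetic file of silence so the sketch runs without the corpus.

    Mono is an assumption here; the catalog entry does not state a channel count.
    """
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(EXPECTED_SAMPLE_WIDTH)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
```

To try it on the real sample, download the audio file linked under Samples and pass its local path to `describe_wav`. If the corpus files turn out to use a different container, `wave.open` will raise `wave.Error` and a different reader would be needed.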
Provider:
Linguistic Data Consortium
Created:
2020-11-30



