CSLU: Speaker Recognition Version 1.1
Source: DataCite Commons. Updated: 2021-07-01. Indexed: 2025-04-16.
Download link:
https://catalog.ldc.upenn.edu/LDC2006S26
Resource description:
<h3>Introduction</h3> <p>This file contains documentation on the CSLU Speaker Recognition Corpus, Version 1.1, Linguistic Data Consortium (LDC) catalog number LDC2006S26 and ISBN 1-58563-382-8. </p><p>The Speaker Recognition corpus (formerly known as Speaker Verification) consists of telephone speech from 91 participants. Each participant recorded speech in twelve sessions over a two-year period, answering questions like "what is your eye color" or responding to prompts like "describe a typical day in your life." Most of the utterances in this release of the corpus have corresponding non-time-aligned word-level transcriptions. </p><p> In most of the CSLU data collections, each participant calls a toll-free telephone number and answers a few questions. CSLU records the speech, transcribes it, then packages it as a released corpus. </p><p> The Speaker Recognition data collection was considerably more complicated. The goal was to collect speech from each participant over a two-year period. Each participant called the data collection system 12 times over the two-year period and said the same utterances each time. </p><p> Some of the recording sessions were only a few days apart and others several weeks apart. Participants followed this calling schedule. During the first month, they called twice in a week. No calls were made in the second and third months. In the fourth month, they made one call. No calls were made in the fifth and sixth months. This pattern repeated three more times for a total of 12 calls per participant.</p><p> To balance the workload required to remind participants to call and to avoid large data collection bursts on the system, the participants were divided into 12 groups. Each group began the two-year schedule one month after the previous group. The first group started in September 1996, the second group started in October 1996, and so on. 
</p><h3>Samples</h3> <p>For an example of the data in this corpus, please listen to the following audio <a href="./desc/addenda/LDC2006S26.wav" rel="nofollow">sample</a>. </p> <br />
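The calling schedule described in the introduction can be worked out with a little arithmetic. The following is a minimal sketch, not part of the corpus documentation: the names `add_months` and `call_months` are hypothetical, and because the documentation gives only the month in which calls were made, each call is represented by the first day of its month as a placeholder.

```python
from datetime import date

# Facts taken from the corpus description.
FIRST_GROUP_START = date(1996, 9, 1)  # first group started in September 1996
NUM_GROUPS = 12                       # one group started per month
CYCLES = 4                            # the six-month pattern occurs four times
CYCLE_MONTHS = 6
# (month offset within a cycle, number of calls made in that month):
# two calls in the first month, one call in the fourth month.
CYCLE_PATTERN = [(0, 2), (3, 1)]


def add_months(start: date, months: int) -> date:
    """Return the first day of the month `months` months after `start`."""
    total = start.year * 12 + (start.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


def call_months(group: int) -> list[date]:
    """List the month of each of the 12 calls for a group numbered 1..12.

    Assumption: the day of the month is unknown, so it is set to 1.
    """
    if not 1 <= group <= NUM_GROUPS:
        raise ValueError("group must be between 1 and 12")
    start = add_months(FIRST_GROUP_START, group - 1)
    months = []
    for cycle in range(CYCLES):
        for offset, n_calls in CYCLE_PATTERN:
            month = add_months(start, cycle * CYCLE_MONTHS + offset)
            months.extend([month] * n_calls)
    return months


if __name__ == "__main__":
    for g in (1, 2, 12):
        schedule = call_months(g)
        print(g, [m.strftime("%Y-%m") for m in schedule])
```

Under these assumptions, every group makes 12 calls spread over 22 calendar months, and the last group's final call falls 11 months after the first group's final call.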
Portions © 1996-2002 Center for Spoken Language Understanding, Oregon Health & Science University, © 2006 Trustees of the University of Pennsylvania
Provider:
Linguistic Data Consortium
Created:
2020-11-30



