CSLU: Multilanguage Telephone Speech Version 1.2

Name: CSLU: Multilanguage Telephone Speech Version 1.2
Creator: Linguistic Data Consortium
Published: 2021-07-01 16:18:30
License: No description available

DataCite Commons: updated 2021-07-01, indexed 2025-04-16

Download link:

https://catalog.ldc.upenn.edu/LDC2006S35

Resource description:

<h3>Introduction</h3> <p>The Multilanguage Telephone Speech corpus consists of telephone speech from 11 languages: English, Farsi, French, German, Hindi, Japanese, Korean, Mandarin, Spanish, Tamil, Vietnamese. The corpus contains fixed vocabulary utterances (e.g., days of the week) as well as fluent continuous speech. The current release includes recorded utterances from about 2,052 speakers, for a total of about 38.5 hours of speech. Time-aligned phonetic transcriptions for 619 of the utterances are also included. </p><h3>Data</h3> <p>Each subject called the CSLU data collection system by dialing a toll-free number. An analog telephone line was connected to a Gradient Technologies box, which recorded the data from incoming calls. The sampling rate was 8 kHz, and the files were stored in 16-bit linear format on a UNIX file system. Each utterance was recorded as a separate file.</p><h3>Samples</h3> <p>For an example of the data in this corpus, please listen to these audio samples in <a href="./desc/addenda/LDC2006S35_Tam.wav" rel="nofollow">Tamil</a> and <a href="./desc/addenda/LDC2006S35_Eng.wav" rel="nofollow">English</a>. </p><br> Portions © 1992, 2000, 2002 Center for Spoken Language Understanding, Oregon Health & Science University, © 2006 Trustees of the University of Pennsylvania
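The Data section states only that the audio is 8 kHz, 16-bit linear. As a minimal sketch of what that implies for anyone handling the samples, the Python below decodes a buffer of 16-bit linear PCM and computes its duration. It assumes headerless data and defaults to little-endian byte order; neither is stated in the description, and the distributed files may carry a container header (the sample links above are .wav files), so treat `decode_pcm16` and `duration_seconds` as hypothetical helpers rather than the corpus's documented access method.

```python
import struct

SAMPLE_RATE_HZ = 8000   # from the corpus description
BYTES_PER_SAMPLE = 2    # 16-bit linear PCM


def decode_pcm16(raw: bytes, little_endian: bool = True) -> list[int]:
    """Decode headerless 16-bit linear PCM into signed integer samples.

    Byte order is an assumption; the description does not specify it.
    """
    if len(raw) % BYTES_PER_SAMPLE:
        raise ValueError("byte count is not a multiple of 2")
    count = len(raw) // BYTES_PER_SAMPLE
    fmt = ("<" if little_endian else ">") + f"{count}h"
    return list(struct.unpack(fmt, raw))


def duration_seconds(raw: bytes) -> float:
    """Duration of a headerless PCM buffer at the stated sampling rate."""
    return len(raw) / (BYTES_PER_SAMPLE * SAMPLE_RATE_HZ)
```

At this rate one second of audio occupies 16,000 bytes, so the stated 38.5 hours would come to roughly 2.2 GB of uncompressed samples.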

Provider:

Linguistic Data Consortium

Created:

2020-11-30
