Levantine Arabic QT Training Data Set 5, Speech

Name: Levantine Arabic QT Training Data Set 5, Speech
Creator: Linguistic Data Consortium
Published: 2021-07-01 16:18:22
License: No description available

DataCite Commons: updated 2021-07-01, indexed 2025-04-16

Download link:

https://catalog.ldc.upenn.edu/LDC2006S29

Resource description:

Introduction

Levantine Arabic QT Training Data Set 5, Speech contains 1,660 calls totalling approximately 250 hours of telephone conversation in Levantine Arabic. The calls were collected between 2003 and 2005. Corresponding transcriptions may be found in LDC2006T07.

Data

This corpus combines four former training data sets: LDC2004E21 and LDC2004E22; LDC2004E65 and LDC2004E66; LDC2005S07 and LDC2005T03; and LDC2005S14 (Speech and Transcripts). More than half of the speakers are Lebanese; the others are Jordanian, Palestinian, and Syrian. The list below shows the distribution of the speakers' national origin:

- 559 Jordanian
- 1,853 Lebanese
- 355 Palestinian
- 67 Syrian
- 484 Levantine speakers whose national origin could not be determined

Samples

For an example of the data in this corpus, listen to this audio sample in SPHERE format (desc/addenda/LDC2006S29.wav).

Updates

Eight SPHERE files were found to have incorrect headers. As of October 2017, these files have been fixed. All copies downloaded after this date contain the corrected files.

Portions © 2003-2006 Trustees of the University of Pennsylvania

Provider:

Linguistic Data Consortium

Created:

2020-11-30
