Gulf Arabic Conversational Telephone Speech
Source: DataCite Commons. Updated 2021-07-01. Indexed 2025-04-16.
Download link:
https://catalog.ldc.upenn.edu/LDC2006S43
Resource description:
<h3>Introduction</h3> <p>This database contains recordings of 975 Gulf Arabic speakers taking part in spontaneous telephone conversations in Colloquial Gulf Arabic. A total of 976 conversation sides are provided (one speaker appears on two distinct calls). The average duration per side is about 5.7 minutes.</p> <p>This corpus was collected and transcribed in 2004 by Appen Pty Ltd (Appen), Sydney, Australia.</p> <h3>Data</h3> <p>Each single-channel file represents just one side of a normal conversation. The "devtest" set is a relatively balanced (representative) sample drawn from the total pool of collected calls. It was chosen through a test-set selection process applied by the National Institute of Standards and Technology (NIST), using demographic, phone and audit information provided by Appen.</p> <h3>Samples</h3> <p>For an example of the data contained in this corpus, please listen to this <a href="./desc/addenda/LDC2006S43.wav" rel="nofollow">audio sample (wav)</a>.</p> <br>
Portions © 2006 Trustees of the University of Pennsylvania
Provider:
Linguistic Data Consortium
Created:
2020-11-30



