CALLFRIEND American English-Southern Dialect Second Edition
Source: DataCite Commons. Updated: 2020-08-17. Indexed: 2025-04-16.
Download link:
https://catalog.ldc.upenn.edu/LDC2020S08
Resource description:
Introduction
CALLFRIEND American English-Southern Dialect Second Edition was developed by LDC and consists of approximately 26 hours of unscripted telephone conversations between native speakers of Southern dialects of American English. This second edition converts the audio files to WAV format, simplifies the directory structure, and adds documentation and metadata. The first edition is available as CALLFRIEND American English-Southern Dialect (LDC96S47).
The CALLFRIEND series is a set of telephone conversations in several languages, collected by LDC in support of language identification technology development. Languages covered in the collection include American English, Canadian French, Egyptian Arabic, Farsi, German, Hindi, Japanese, Korean, Mandarin Chinese, Spanish, Tamil and Vietnamese.
Data
All data was collected before July 1997. Participants could speak with a person of their choice on any topic; most called family members and friends. All calls originated in North America. The recorded conversations last up to 30 minutes.
The data was originally recorded as 8 kHz u-law stereo files in SPH (NIST SPHERE) format, with one side of the phone call on each channel. In this release, the files were converted to WAV format, and information from the original SPH headers is described in the documentation. SPH files are not included in this second edition.
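Because each side of a call sits on its own channel, a common first step is to separate the two speakers. The following is a minimal sketch using Python's standard `wave` module. It assumes the released WAV files hold 16-bit linear PCM samples, which the description above does not state, so check the corpus documentation first. The function name `split_channels` and the file name in the usage note are hypothetical.

```python
import struct
import wave


def split_channels(path):
    """Return (sample_rate, side_a, side_b) for a two-channel WAV file.

    Assumption: samples are 16-bit little-endian PCM. The corpus
    description does not confirm the sample encoding of the WAV files.
    """
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 2:
            raise ValueError("expected a two-channel file")
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch assumes 16-bit PCM samples")
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    # Frames are interleaved: A0, B0, A1, B1, ...
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    side_a = list(samples[0::2])  # first channel: one end of the call
    side_b = list(samples[1::2])  # second channel: the other end
    return rate, side_a, side_b
```

A call such as `rate, a, b = split_channels("example_call.wav")` would give one sample list per speaker, with `rate` expected to be 8000 for this corpus. The `wave` module reads uncompressed PCM only, so if the files turn out to store u-law samples, a different reader would be needed.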
The audio files were originally split into train, dev and test folders of 20 recordings each, but they are combined in this release.
Completed calls passed through a human auditing process to verify that the target language was spoken by the participants, to check the quality of the recordings, and to record information about dialect, noise and distortion.
Samples
Please view this audio sample.
Updates
None at this time.
Copyright
Portions © 1996, 1997, 2020 Trustees of the University of Pennsylvania
Provider:
Linguistic Data Consortium
Created:
2020-08-06



