CALLFRIEND Mandarin Chinese-Mainland Dialect

Name: CALLFRIEND Mandarin Chinese-Mainland Dialect
Creator: Linguistic Data Consortium
Published: 2021-07-01 16:36:55
License: No description available

DataCite Commons; updated 2021-07-01; indexed 2024-07-13

Download link:

https://catalog.ldc.upenn.edu/LDC96S55

Resource description:

<h3>Introduction</h3><br> <p>CALLFRIEND Mandarin Chinese-Mainland Dialect was developed by the Linguistic Data Consortium (LDC) and consists of approximately 24 hours of unscripted telephone conversations between native speakers of the Mandarin Chinese dialect spoken in mainland China.</p><br> <p>The CALLFRIEND series is a collection of telephone conversations in several languages conducted by LDC in support of language identification technology development. Languages covered in the collection include American English, Canadian French, Egyptian Arabic, Farsi, German, Hindi, Japanese, Korean, Mandarin Chinese, Spanish, Tamil and Vietnamese.</p><br> <p>An updated edition of this corpus is available as CALLFRIEND Mandarin Chinese-Mainland Dialect Second Edition (<a href="../../../LDC2018S09">LDC2018S09</a>). The second edition updates the audio files to wav format, simplifies the directory structure and adds documentation and metadata.</p><br> <h3>Data</h3><br> <p>The corpus consists of 60 unscripted telephone conversations, each lasting between 5 and 30 minutes. The corpus also includes documentation describing speaker information (sex, age, education, callee telephone number) and call information (channel quality, number of speakers).</p><br> <p>For each conversation, both the caller and callee are native speakers of Mandarin Chinese from Mainland China. All calls are domestic and were placed inside the continental United States and Canada.</p><br> <p>Callers in the "Mainland" and "Taiwan" collections of CALLFRIEND Mandarin were identified primarily on the basis of specific attributes in their speech characteristic of geographic origin.</p><br> <h3>Updates</h3><br> <p>There are no updates at this time.</p><br> Portions © 1996 Trustees of the University of Pennsylvania

Provider:

Linguistic Data Consortium

Created:

2020-11-30

Collection summary

Dataset introduction