Fisher English Training Speech Part 1 Speech

Mendeley Data, updated 2024-01-31, indexed 2024-06-27

Download link:

https://catalog.ldc.upenn.edu/LDC2004S13


Resource description:

Introduction

Fisher English Training Speech Part 1 Speech represents the first half of a collection of conversational telephone speech (CTS) that was created at the LDC during 2003. It contains 5,850 audio files, each one containing a full conversation of up to 10 minutes. Additional information regarding the speakers involved and types of telephones used can be found in the companion text corpus of transcripts, Fisher English Training Speech Part 1, Transcripts (LDC2004T19).

The Fisher telephone conversation collection protocol was created at LDC to address a critical need of developers trying to build robust automatic speech recognition (ASR) systems. Previous collection protocols, such as CALLFRIEND and Switchboard-II, and the resulting corpora have been adapted for ASR research but were in fact developed for language identification and speaker identification, respectively. Although the CALLHOME protocol and corpora were developed to support ASR technology, they feature small numbers of speakers making telephone calls of relatively long duration, with a narrow vocabulary across the collection. CALLHOME conversations are challengingly natural and intimate.

Under the Fisher protocol, a very large number of participants each make a few calls of short duration, speaking to other participants, whom they typically do not know, about assigned topics. This maximizes inter-speaker variation and vocabulary breadth while also increasing formality.

Previous protocols such as CALLHOME, CALLFRIEND and Switchboard relied upon participant activity to drive the collection. Fisher is unique in being platform driven rather than participant driven. Participants who wish to initiate a call may do so; however, the collection platform initiates the majority of calls. Participants need only answer their phones at the times they specified when registering for the study.

To encourage a broad range of vocabulary, Fisher participants are asked to speak on an assigned topic. The topic is selected at random from a list, changes every 24 hours, and is assigned to all subjects paired on that day. Some topics are inherited or refined from previous Switchboard studies, while others were developed specifically for the Fisher protocol.

Data

The individual audio files are presented in NIST SPHERE format and contain two-channel mu-law sample data; "shorten" compression has been applied to all files.

Data collection and transcription were sponsored by DARPA and the U.S. Department of Defense, as part of the EARS project for research and development in automatic speech recognition.

Samples

Please examine this sample to see an example of the data in this corpus.

© 2003-2004 Trustees of the University of Pennsylvania
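As a rough illustration of the NIST SPHERE format mentioned above, the following minimal Python sketch reads the plain-text header that begins a SPHERE file. It is not an LDC tool, and it does not undo the "shorten" compression of the sample data. The function name `parse_sphere_header` is hypothetical, and the demo header is synthetic: its field values are illustrative and were not copied from a corpus file. The sketch also assumes every header field fits on one line.

```python
from typing import Dict, Tuple, Union

FieldValue = Union[int, float, str]


def parse_sphere_header(raw: bytes) -> Tuple[int, Dict[str, FieldValue]]:
    """Parse the ASCII header at the start of a NIST SPHERE file.

    Returns the declared header size in bytes and a dict of header fields.
    Assumes each field occupies a single line (no embedded newlines).
    """
    first, second = raw.split(b"\n", 2)[:2]
    if first.strip() != b"NIST_1A":
        raise ValueError("missing NIST_1A magic line")
    header_size = int(second.strip())  # e.g. 1024

    fields: Dict[str, FieldValue] = {}
    text = raw[:header_size].decode("ascii", errors="replace")
    for line in text.split("\n")[2:]:
        line = line.strip()
        if line == "end_head":
            break
        if not line or line.startswith(";"):  # skip blanks and comments
            continue
        name, type_tag, value = line.split(None, 2)
        if type_tag == "-i":            # integer field
            fields[name] = int(value)
        elif type_tag == "-r":          # real-valued field
            fields[name] = float(value)
        elif type_tag.startswith("-s"):  # string field, "-s<length>"
            fields[name] = value[: int(type_tag[2:])]
        else:
            raise ValueError(f"unknown field type {type_tag!r}")
    return header_size, fields


# Synthetic header for demonstration only (values are illustrative).
demo_header = (
    b"NIST_1A\n   1024\n"
    b"channel_count -i 2\n"
    b"sample_rate -i 8000\n"
    b"sample_coding -s4 ulaw\n"
    b"end_head\n"
).ljust(1024)

size, info = parse_sphere_header(demo_header)
print(size, info)
```

With a real file, one would pass the first bytes read from disk (for example, the first 1024 bytes) and then use the returned header size to locate the start of the sample data.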


Created:

2024-01-31

Collected summary

Dataset introduction

Background and challenges

Background overview

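This dataset is Fisher English Training Speech Part 1. It contains 984 hours of English telephone conversation speech and is intended to support automatic speech recognition research. The data come from 5,850 conversations of up to 10 minutes each, stored as ulaw samples at a sampling rate of 8000 Hz, and are meant to be used together with the companion transcripts. Its distinguishing feature is a platform-driven collection protocol in which participants discuss assigned topics that change daily, which increases vocabulary diversity and formality and makes the corpus well suited to ASR system development.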
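To make the "ulaw at 8000 Hz, two channels" description concrete, here is a minimal Python sketch of standard G.711 mu-law expansion to 16-bit linear samples, plus a helper that separates the two sides of a conversation. It assumes the bytes have already been decompressed from "shorten" and that the two channels are interleaved sample by sample. The function names `ulaw_to_linear` and `split_channels` are hypothetical, and this is not code distributed with the corpus.

```python
from typing import List, Tuple

BIAS = 0x84  # bias constant used by G.711 mu-law coding


def ulaw_to_linear(code: int) -> int:
    """Expand one 8-bit mu-law code to a signed linear sample."""
    inverted = ~code & 0xFF            # mu-law bytes are stored bit-inverted
    exponent = (inverted >> 4) & 0x07  # 3-bit segment number
    mantissa = inverted & 0x0F         # 4-bit step within the segment
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if inverted & 0x80 else magnitude


def split_channels(interleaved: bytes) -> Tuple[List[int], List[int]]:
    """Decode interleaved two-channel mu-law bytes into two sample lists.

    Assumption: samples alternate channel 0, channel 1, channel 0, ...
    """
    channel_a = [ulaw_to_linear(b) for b in interleaved[0::2]]
    channel_b = [ulaw_to_linear(b) for b in interleaved[1::2]]
    return channel_a, channel_b


# Four synthetic bytes: two samples per channel.
side_a, side_b = split_channels(bytes([0xFF, 0x00, 0x80, 0x7F]))
print(side_a, side_b)  # → [0, 32124] [-32124, 0]
```

Under the same assumptions (one byte per mu-law sample), a full 10-minute two-channel conversation occupies 8000 × 2 × 600 = 9,600,000 bytes before compression.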

The content above was collected and summarized by 遇见数据集.
