2019 NIST Speaker Recognition Evaluation Test Set -- CTS Challenge
Source: DataCite Commons | Updated: 2025-05-06 | Indexed: 2024-07-13
Download link:
https://catalog.ldc.upenn.edu/LDC2023S03
Resource description:
<h3>Introduction</h3>
<p>2019 NIST Speaker Recognition Evaluation Test Set -- CTS Challenge was developed by the Linguistic Data Consortium (LDC) and NIST (National Institute of Standards and Technology). It contains approximately 635 hours of Tunisian Arabic telephone recordings for development and test, answer keys, enrollment, trial files and documentation from the CTS Challenge portion of the NIST-sponsored <a href="https://www.nist.gov/itl/iad/mig/nist-2019-speaker-recognition-evaluation">2019 Speaker Recognition Evaluation (SRE)</a>.</p>
<p>The ongoing series of yearly SRE evaluations conducted by NIST is intended to be of interest to researchers working on the general problem of text-independent speaker recognition. To this end, the evaluations are designed to be simple, to focus on core technology issues, to be fully supported, and to be accessible to those wishing to participate.</p>
<p>The 2019 evaluation task was speaker detection, that is, to determine whether a specified target speaker was speaking during a segment of speech. The evaluation was conducted in two parts: (1) a leaderboard-style challenge based on conversational telephone speech from LDC's Call My Net 2 (CMN2) corpus; and (2) a separate evaluation using audio-visual material collected by LDC for the VAST (Video Annotation for Speech Technology) project. Further information about the evaluation is contained in the evaluation plan included in this release.</p>
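<p>As a rough illustration of the detection task described above, the sketch below treats each trial as a score compared against a threshold and counts the two possible error types (misses and false alarms). The function name, the score convention (higher means "more likely the target speaker") and the threshold are assumptions made for this example; the official metric and trial file formats are defined in the evaluation plan and are not reproduced here.</p>

```python
from typing import Sequence, Tuple


def detection_error_rates(
    scores: Sequence[float],
    is_target: Sequence[bool],
    threshold: float,
) -> Tuple[float, float]:
    """Return (miss_rate, false_alarm_rate) for a set of detection trials.

    Hypothetical helper for illustration only: a trial is accepted when its
    score is greater than or equal to the threshold.
    """
    if len(scores) != len(is_target):
        raise ValueError("scores and is_target must have the same length")

    n_target = sum(1 for t in is_target if t)
    n_nontarget = len(is_target) - n_target
    if n_target == 0 or n_nontarget == 0:
        raise ValueError("need at least one target and one non-target trial")

    # A miss is a target trial that was rejected.
    misses = sum(1 for s, t in zip(scores, is_target) if t and s < threshold)
    # A false alarm is a non-target trial that was accepted.
    false_alarms = sum(
        1 for s, t in zip(scores, is_target) if not t and s >= threshold
    )
    return misses / n_target, false_alarms / n_nontarget
```

<p>Raising the threshold typically trades false alarms for misses, which is why detection systems are usually compared across a range of operating points rather than at a single threshold.</p>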
<h3>Data</h3>
<p>The telephone speech data for the CTS Challenge was drawn from the CMN2 collection conducted by LDC in Tunisia, in which Tunisian Arabic speakers called friends or relatives who agreed to record their telephone conversations, each lasting 8 to 10 minutes. The speech segments include PSTN (public switched telephone network) and VoIP (voice over IP) data. The telephone speech is presented in SPHERE format as 8-bit A-law with a sample rate of 8000 Hz (8 kHz).</p>
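<p>For readers unfamiliar with A-law audio, the following minimal sketch expands 8-bit A-law bytes into 16-bit linear PCM values using the standard G.711 expansion, and estimates duration from the 8 kHz telephone sample rate. It assumes the raw sample payload has already been separated from the SPHERE header and holds a single channel; header parsing is not shown, and the function names are illustrative rather than part of any tool shipped with the corpus.</p>

```python
from typing import List

SAMPLE_RATE_HZ = 8000  # telephone-band sample rate (8 kHz)


def alaw_to_pcm16(byte: int) -> int:
    """Expand one 8-bit A-law code word to a signed 16-bit linear sample."""
    a = byte ^ 0x55               # undo the even-bit inversion used by G.711
    t = (a & 0x0F) << 4           # 4-bit mantissa
    segment = (a & 0x70) >> 4     # 3-bit segment (exponent)
    if segment == 0:
        t += 8
    else:
        t += 0x108
        t <<= segment - 1
    # In A-law, a set sign bit (after inversion) marks a positive sample.
    return t if a & 0x80 else -t


# Precompute all 256 code words so decoding is a table lookup.
_ALAW_TABLE = [alaw_to_pcm16(b) for b in range(256)]


def decode_alaw(payload: bytes) -> List[int]:
    """Decode a raw A-law payload (one byte per sample) to PCM samples."""
    return [_ALAW_TABLE[b] for b in payload]


def duration_seconds(payload: bytes, channels: int = 1) -> float:
    """Estimate duration of a raw 8-bit A-law payload at 8 kHz."""
    return len(payload) / (SAMPLE_RATE_HZ * channels)
```

<p>Because each sample occupies one byte, a single-channel 10-minute conversation corresponds to roughly 4.8 million payload bytes under these assumptions.</p>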
<h3>Samples</h3>
<p>Please see the following <a href="desc/addenda/LDC2023S03.sph">audio sample</a>.</p>
<h3>Updates</h3>
<p>None at this time.</p>
Provider:
Linguistic Data Consortium
Created:
2023-05-16



