1999 HUB4 Broadcast News Evaluation English Test Material
Source: DataCite Commons. Updated: 2021-07-01. Indexed: 2025-04-16.
Download link:
https://catalog.ldc.upenn.edu/LDC2000S88
Resource description:
<h3>Introduction</h3><br>
<p>This publication contains the English evaluation test material used in the 1999 NIST Broadcast News Transcription Evaluation, administered by the <a href="http://www.nist.gov/speech" rel="nofollow">NIST Spoken Natural Language Processing Group</a> and produced by the <a href="http://www.ldc.upenn.edu" rel="nofollow">Linguistic Data Consortium</a>. Catalog number: LDC2000S88. ISBN: 1-58563-176-0.</p><br>
<h3>Data</h3><br>
<p>The test material is contained in two SPHERE-formatted waveform files. The file bn99en_1.sph (set1) contains 1.5 hours of Broadcast News excerpts drawn from the same epoch as set2 of the previous year's evaluation. The file bn99en_2.sph (set2) contains 1.5 hours of Broadcast News excerpts from the summer of 1998. Each file should be recognized separately, per the <a href="http://www.itl.nist.gov/iaui/894.01/tests/bnr/bnews_99/bnews_99.htm" rel="nofollow">Broadcast News English Evaluation Specification</a>.</p><br>
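<p>For readers unfamiliar with SPHERE files, the following is a minimal sketch of reading the plain-text header that precedes the audio samples. It assumes the commonly documented NIST_1A layout (a magic line, a header-size line, then <code>name -type value</code> fields ending with <code>end_head</code>). The function name <code>parse_sphere_header</code> and the sample field values are illustrative only and are not taken from the corpus files.</p>

```python
def parse_sphere_header(data: bytes):
    """Return (header_size, fields) parsed from the start of a SPHERE file.

    Assumes the NIST_1A header layout; verify against the actual files.
    """
    lines = data.split(b"\n", 2)
    if len(lines) < 3 or lines[0].strip() != b"NIST_1A":
        raise ValueError("not a NIST_1A SPHERE header")
    header_size = int(lines[1].strip())  # total header length in bytes
    header = data[:header_size].decode("ascii", errors="replace")

    fields = {}
    for line in header.split("\n")[2:]:
        line = line.strip()
        if line == "end_head":
            break
        if not line or line.startswith(";"):
            continue  # skip blank lines and comments
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # ignore lines that are not "name -type value"
        name, ftype, value = parts
        if ftype == "-i":
            fields[name] = int(value)
        elif ftype == "-r":
            fields[name] = float(value)
        else:
            fields[name] = value  # "-sN" string field
    return header_size, fields


# Hypothetical header for illustration; real values come from the .sph file.
sample = (
    b"NIST_1A\n   1024\n"
    b"sample_rate -i 16000\n"
    b"channel_count -i 1\n"
    b"sample_coding -s3 pcm\n"
    b"end_head\n"
).ljust(1024, b" ")

size, fields = parse_sphere_header(sample)
print(size, fields)
```

<p>With a real file, the same function could be applied to the first bytes read in binary mode, for example <code>open("bn99en_1.sph", "rb").read(4096)</code>. The audio samples begin at the byte offset given by the header size.</p><br>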
<p>Additional test material for each set is also included. Test materials include evaluation map files (<a href="../../../Catalog/desc/addenda/LDC2000S88_1.uem" rel="nofollow">bn99en_1.uem</a>), automatically generated segmentation files (<a href="../../../Catalog/desc/addenda/LDC2000S88_2.seg" rel="nofollow">bn99en_1.seg</a>), transcripts from the evaluation (<a href="../../../Catalog/desc/addenda/LDC2000S88_3.utf" rel="nofollow">bn99en_1.utf</a>) and the <a href="../../../Catalog/desc/addenda/LDC2000S88_4.dtd" rel="nofollow">utf.dtd</a> used to validate the transcripts, reference STM files (<a href="../../../Catalog/desc/addenda/LDC2000S88_5.stm" rel="nofollow">bn99en_1.stm</a>), and transcript orthography mapping files (<a href="../../../Catalog/desc/addenda/LDC2000S88_6.glm" rel="nofollow">en981118.glm</a>). For more complete information, see the <a href="http://www.nist.gov/speech" rel="nofollow">1998 HUB4 Website</a>.</p><br>
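<p>As a rough illustration of how the reference STM files might be read, the sketch below parses one segment line. It assumes the STM layout commonly documented for NIST scoring tools: file ID, channel, speaker, start time, end time, an optional <code>&lt;label&gt;</code>, then the transcript, with comment lines starting with <code>;;</code>. The names <code>StmSegment</code> and <code>parse_stm_line</code> and the sample line are hypothetical, and the layout should be checked against the files in this release.</p>

```python
from typing import NamedTuple, Optional


class StmSegment(NamedTuple):
    file_id: str
    channel: str
    speaker: str
    start: float          # segment start time in seconds
    end: float            # segment end time in seconds
    label: Optional[str]  # optional "<...>" condition label
    text: str             # reference transcript


def parse_stm_line(line: str) -> Optional[StmSegment]:
    """Parse one STM line; return None for blank or ';;' comment lines."""
    line = line.strip()
    if not line or line.startswith(";;"):
        return None
    parts = line.split(None, 5)
    if len(parts) < 5:
        raise ValueError(f"malformed STM line: {line!r}")
    file_id, channel, speaker, start, end = parts[:5]
    rest = parts[5] if len(parts) > 5 else ""
    label = None
    if rest.startswith("<"):
        close = rest.find(">")
        if close != -1:
            label = rest[: close + 1]
            rest = rest[close + 1 :].strip()
    return StmSegment(file_id, channel, speaker,
                      float(start), float(end), label, rest)


# Invented example line, not copied from the corpus.
seg = parse_stm_line("bn99en_1 1 spk1 10.50 14.25 <o,f0,male> hello world")
print(seg)
```

<p>Returning <code>None</code> for comments keeps the caller simple: a whole file can be read with a loop that skips <code>None</code> results.</p><br>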
<h3>Updates</h3><br>
<p>There are no updates at this time.</p><br>
<p><em>Note that the waveform and transcript data on this disc are licensed through the <a href="http://www.ldc.upenn.edu" rel="nofollow">Linguistic Data Consortium (LDC)</a> and are subject to usage restrictions. Contact the <a rel="nofollow">LDC</a> for license agreement information.</em></p><br>
<h3>Additional Licensing Instructions</h3><br>
<p>This 'members-only' corpus is available to current members, who can request the data at the listed reduced-license fee. Contact <a href="mailto:ldc@ldc.upenn.edu">ldc@ldc.upenn.edu</a> for information about becoming a member.</p><br>
Portions Copyright 1998 PRI-Public Radio International. Portions Copyright 1997-1998 ABC News. Portions Copyright 1998 NBC News. Portions Copyright 1997-1998 Cable News Network, Inc. All Rights Reserved.
Provider:
Linguistic Data Consortium
Created:
2020-11-30



