NIST 2003 Open Machine Translation (OpenMT) Evaluation
Source: DataCite Commons. Updated: 2021-07-01. Indexed: 2025-04-16.
Download link:
https://catalog.ldc.upenn.edu/LDC2010T11
Resource description:
<h3>Introduction</h3><br>
<p>NIST 2003 Open Machine Translation (OpenMT) Evaluation is a package containing source data, reference translations, and scoring software used in the NIST 2003 OpenMT evaluation. It is designed to help evaluate the effectiveness of machine translation systems. The package was compiled and scoring software was developed by researchers at NIST, making use of newswire source data and reference translations collected and developed by LDC.</p><br>
<p>The objective of the NIST OpenMT evaluation series is to support research in, and help advance the state of the art of, machine translation (MT) technologies -- technologies that translate text between human languages. Input may include all forms of text. The goal is for the output to be an adequate and fluent translation of the original.</p><br>
<p>The MT evaluation series started in 2001 as part of the DARPA TIDES (Translingual Information Detection, Extraction and Summarization) program. Beginning with the 2006 evaluation, the evaluations have been driven and coordinated by NIST as NIST OpenMT. These evaluations provide an important contribution to the direction of research efforts and the calibration of technical capabilities in MT. The OpenMT evaluations are intended to be of interest to all researchers working on the general problem of automatic translation between human languages. To this end, they are designed to be simple, to focus on core technology issues, and to be fully supported. The 2003 task was to evaluate translation from Chinese to English and from Arabic to English.</p><br>
<p>Additional information about these evaluations may be found at the <a href="https://www.nist.gov/itl/iad/mig/open-machine-translation-evaluation" rel="nofollow">NIST Open Machine Translation (OpenMT) Evaluation web site</a>.</p><br>
<h3>Scoring Tools</h3><br>
<p>This evaluation kit includes a single Perl script (mteval-v09c.pl) that may be used to produce a translation quality score for one or more MT systems. The script works by comparing the system output translation with a set of expert reference translations of the same source text. The comparison is based on finding sequences of words in the reference translations that match word sequences in the system output translation. More information on the evaluation algorithm may be obtained from the paper that describes it: <a href="http://www.aclweb.org/anthology/P/P02/P02-1040.pdf" rel="nofollow">BLEU: a Method for Automatic Evaluation of Machine Translation (Papineni et al., 2002)</a>.</p><br>
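<p>As a rough illustration of the word-sequence matching described above, the following Python sketch computes clipped n-gram counts and a sentence-level BLEU score in the style of Papineni et al. (2002). It is not the code of mteval-v09c.pl. The function names (ngrams, clipped_counts, sentence_bleu) are hypothetical, tokenization is assumed to be plain whitespace splitting, and the brevity penalty assumes the closest reference length, which may differ from what the script does.</p><br>

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous word sequence of length n."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_counts(candidate, references, n):
    """Return (matched, total) n-gram counts for one candidate sentence.

    Each candidate n-gram is credited at most as many times as it occurs
    in the single reference that contains it most often ("clipping").
    """
    cand_counts = ngrams(candidate, n)
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    matched = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    total = sum(cand_counts.values())
    return matched, total


def sentence_bleu(candidate, references, max_n=4):
    """Unsmoothed sentence-level BLEU with uniform n-gram weights."""
    log_sum = 0.0
    for n in range(1, max_n + 1):
        matched, total = clipped_counts(candidate, references, n)
        if matched == 0:
            # Without smoothing, a zero precision makes the whole score zero.
            return 0.0
        log_sum += math.log(matched / total) / max_n
    c = len(candidate)
    # Assumption: use the reference length closest to the candidate length.
    r = len(min(references, key=lambda ref: (abs(len(ref) - c), len(ref))))
    brevity_penalty = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity_penalty * math.exp(log_sum)


if __name__ == "__main__":
    cand = "the the the the the the the".split()
    refs = ["the cat is on the mat".split(), "there is a cat on the mat".split()]
    print(clipped_counts(cand, refs, 1))
```

<p>The example input is the degenerate candidate discussed in the BLEU paper: only two of its seven unigrams are credited, because "the" occurs at most twice in any single reference. Corpus-level scoring would sum the matched and total counts over all segments before taking the ratio, rather than averaging per-sentence scores.</p><br>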
<p>The included scoring script was released with the original evaluation, intended for use with SGML-formatted data files, and is provided to ensure compatibility of user scoring results with results from the original evaluation. An updated scoring software package (mteval-v13a-20091001.tar.gz), with XML support, additional options and bug fixes, documentation, and example translations, may be downloaded from the <a href="https://www.nist.gov/itl/iad/mig/tools" rel="nofollow">NIST Multimodal Information Group Tools</a> website.</p><br>
<h3>Data</h3><br>
<p>The Chinese-language and Arabic-language source text included in this corpus is a reorganization of data that was initially released to the public as <a href="http://catalog.ldc.upenn.edu/LDC2006T04" rel="nofollow">Multiple-Translation Chinese (MTC) Part 4 (LDC2006T04)</a> and <a href="http://catalog.ldc.upenn.edu/LDC2005T05" rel="nofollow">Multiple-Translation Arabic (MTA) Part 2 (LDC2005T05)</a>, respectively. The reference translations are a reorganized subset of data from these same Multiple-Translation corpora. All source data for this corpus is newswire text collected in January and February of 2003 from Agence France-Presse and Xinhua News Agency. For details on the methodology of the source data collection and production of reference translations, see the documentation for the above-mentioned corpora.</p><br>
<p>For each language, the test set consists of two files, a source and a reference file. Each reference file contains four independent translations of the data set. The evaluation year, source language, test set (which, by default, is "evalset"), version of the data, and source vs. reference file (with the latter being indicated by "-ref") are reflected in the file name.</p><br>
<p>DARPA TIDES MT and NIST OpenMT evaluations used SGML-formatted test data until 2008 and XML-formatted test data thereafter. The files in this package are provided in both formats.</p><br>
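<p>To show how a reference file holding several independent translations might be consumed, the Python sketch below groups the translations of each segment by document and segment id. The element and attribute names (mteval, refset, doc, seg, refid, docid, id) and the inline sample are assumptions modeled on the layout commonly used with the NIST scoring tools; they are not taken from this package and should be checked against its actual files. The load_references function is likewise a hypothetical helper.</p><br>

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical miniature reference file with two translations of one document.
# The real files in the package may use different names or nesting.
SAMPLE = """<mteval>
  <refset setid="example_set" srclang="Arabic" trglang="English" refid="ref1">
    <doc docid="doc1" genre="nw">
      <seg id="1">Reference one , segment one .</seg>
      <seg id="2">Reference one , segment two .</seg>
    </doc>
  </refset>
  <refset setid="example_set" srclang="Arabic" trglang="English" refid="ref2">
    <doc docid="doc1" genre="nw">
      <seg id="1">Reference two , segment one .</seg>
      <seg id="2">Reference two , segment two .</seg>
    </doc>
  </refset>
</mteval>"""


def load_references(xml_text):
    """Map (docid, segment id) to the list of reference translations."""
    root = ET.fromstring(xml_text)
    refs = defaultdict(list)
    for refset in root.iter("refset"):
        for doc in refset.iter("doc"):
            for seg in doc.iter("seg"):
                key = (doc.get("docid"), seg.get("id"))
                refs[key].append((seg.text or "").strip())
    return dict(refs)


if __name__ == "__main__":
    for key, translations in sorted(load_references(SAMPLE).items()):
        print(key, translations)
```

<p>Keying on the document and segment ids keeps the parallel translations aligned, which is what a multi-reference scorer needs when it compares one system segment against all of its references. The SGML versions of the files are not necessarily well-formed XML, so this approach is only a starting point for the XML copies.</p><br>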
<h3>Sample</h3><br>
<p>The <a href="desc/addenda/LDC2010T11_sample.txt" rel="nofollow">sample text file</a> contains excerpts from different XML files included in this corpus, including reference translations and source text for a single newswire document. The file is encoded in UTF-8.</p><br>
<h3>Updates</h3><br>
<p>There are no updates available at this time.</p><br>
Portions © 2003 Agence France-Presse, © 2003 Xinhua News Agency, © 2004-2006, 2010 Trustees of the University of Pennsylvania.
Provider:
Linguistic Data Consortium
Created:
2020-11-30



