WMT 2017
Source: OpenXLab (indexed 2026-04-18)
Download link:
https://openxlab.org.cn/datasets/OpenDataLab/WMT 2017
Description:
We provide training data for seven language pairs, and a common framework. The task is to improve current methods. This can be done in many ways. For instance, participants could try to:
improve word alignment quality, phrase extraction, phrase scoring
add new components to the open source software of the baseline system
augment the system otherwise (e.g. by preprocessing, reranking, etc.)
build an entirely new translation system
Participants will use their systems to translate a test set of unseen sentences in the source language. The translation quality is measured by a manual evaluation and various automatic evaluation metrics. Participants agree to contribute about eight hours of work to the manual evaluation.
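The description above does not say which automatic metrics are used. As an illustration only, BLEU is one widely used automatic metric for machine translation, and the sketch below shows roughly how such a score compares system output against reference translations. It is a simplified version under stated assumptions: whitespace tokenization, one reference per sentence, no smoothing. The function names `ngrams` and `corpus_bleu` are hypothetical and do not come from the task's tooling.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified corpus-level BLEU (assumes one reference per hypothesis)."""
    matches = [0] * max_n  # clipped n-gram matches, per order
    totals = [0] * max_n   # n-grams produced by the system, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()  # assumption: whitespace tokenization
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            totals[n - 1] += sum(h_counts.values())
    # Without smoothing, any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # The brevity penalty discourages output shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    system_output = ["the cat sat on the mat today"]
    reference = ["the cat sat on the mat"]
    print(round(corpus_bleu(system_output, reference), 3))
```

An output identical to its reference scores 1.0, and an output sharing no words with it scores 0.0. Real evaluations normally rely on a standardized tool with fixed tokenization so that scores are comparable across submissions.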
You may participate in any or all of the seven language pairs. For all language pairs we will test translation in both directions. To have a common framework that allows for comparable results, and also to lower the barrier to entry, we provide a common training set.
We also strongly encourage your participation if you use your own training corpus, your own sentence alignment, your own language model, or your own decoder.
If you use additional training data or existing translation systems, you must flag that your system uses additional data. We will distinguish system submissions that used the provided training data (constrained) from submissions that used significant additional data resources. Note that basic linguistic tools such as taggers, parsers, or morphological analyzers are allowed in the constrained condition.
Your submission report should highlight in which ways your own methods and data differ from the standard task. We may break down submitted results into different tracks, based on what resources were used. We are mostly interested in submissions that are constrained to the provided training data, so that the comparison is focused on the methods, not on the data used. You may submit contrastive runs to demonstrate the benefit of additional training data.
Provider:
OpenDataLab
Created:
2023-12-07



