WMT 2015
Source: arXiv. Indexed: 2025-09-30
Download link:
http://www.statmt.org/wmt15/index.html
Description:
The WMT15 dataset contains 4.5 million German-to-English translation pairs. newstest2013 (3,000 pairs) is used as the development set and newstest2015 (2,169 pairs) as the test set. The text is tokenized with Byte Pair Encoding (BPE) using 32K merge operations and a shared vocabulary. The dataset supports the task of simultaneous machine translation.
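As a rough illustration of what BPE merge operations are, the sketch below learns a few merges from a tiny word list. It is a toy version, not the tooling used to prepare this dataset: the function name `learn_bpe`, the `</w>` end-of-word marker, and the tie-breaking rule are assumptions made for the example, and the sample words are invented. It also assumes that a "shared vocabulary" means merges are learned on German and English text together, which is why the sample list mixes both languages.

```python
from collections import Counter


def learn_bpe(words, num_merges):
    """Learn up to num_merges BPE merges from a list of words (toy sketch)."""
    # Each word becomes a tuple of characters plus an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair; ties are broken by the pair itself
        # so the result is deterministic (an assumption of this sketch).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Rewrite every word, fusing each occurrence of the chosen pair.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged = []
            i = 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges


# Hypothetical mixed German/English sample; the real setting uses 32K merges.
sample = ["haus", "haus", "house"]
print(learn_bpe(sample, 3))
```

With 32K merges over the full 4.5 million pairs, the same procedure would yield the subword inventory; the toy call above stops after three merges.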
Provider:
WMT



