IWSLT15
Source: arXiv. Indexed: 2025-09-30
Download link:
http://nlp.stanford.edu/projects/nmt/
Resource description:
IWSLT15 is an English-Vietnamese parallel corpus of about 133,000 sentence pairs. TED tst2012 (1,553 pairs) serves as the development set and TED tst2013 (1,268 pairs) as the test set. The English and Vietnamese vocabulary sizes are about 17,000 and 7,700, respectively. The dataset is used for simultaneous machine translation.
Provider:
Stanford NLP
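
The split sizes quoted in the description can be sanity-checked once the files are downloaded. The following is a minimal sketch, assuming each split is stored as two line-aligned UTF-8 text files, one sentence per line; the helper name `read_parallel` and any file names passed to it are illustrative assumptions, not something this listing specifies.

```python
from typing import List, Tuple


def _read_lines(path: str) -> List[str]:
    """Read a UTF-8 text file and return its lines without trailing newlines."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\r\n") for line in handle]


def read_parallel(src_path: str, tgt_path: str) -> List[Tuple[str, str]]:
    """Pair up two line-aligned files into (source, target) sentence pairs.

    Hypothetical helper: assumes line i of the source file is the
    translation counterpart of line i of the target file.
    """
    src_lines = _read_lines(src_path)
    tgt_lines = _read_lines(tgt_path)
    if len(src_lines) != len(tgt_lines):
        raise ValueError(
            f"line count mismatch: {len(src_lines)} source vs {len(tgt_lines)} target"
        )
    return [(s.strip(), t.strip()) for s, t in zip(src_lines, tgt_lines)]
```

Calling `len(read_parallel(dev_en, dev_vi))` on the development files should give 1,553 if the files match the description, and 1,268 for the test files. The length check fails loudly on misaligned files rather than silently truncating, which matters for parallel data.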



