German-Upper Sorbian (de-hsb) parallel corpus
Source: arXiv. Indexed: 2025-09-30
Download link:
https://www.statmt.org/wmt20/unsup_and_very_low_res/
Description:
This dataset provides German-Upper Sorbian parallel sentence pairs for training, validating, and testing low-resource machine translation systems. It is intended to strengthen Neural Machine Translation (NMT) models with high-quality synthetic data. The training set contains 60,000 sentence pairs; the validation and test sets contain 2,000 sentence pairs each.
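As a rough illustration of how such a corpus is typically consumed, the sketch below reads two line-aligned text files into sentence pairs and checks a split against the sizes stated in the description. It assumes the common one-sentence-per-line layout with one file per language; that layout, the split names, and the helper names `read_parallel` and `check_split` are assumptions made for this example, not details confirmed by the listing.

```python
# Split sizes as stated in the dataset description.
# The split names used as keys are hypothetical labels for this sketch.
EXPECTED_PAIRS = {"train": 60_000, "valid": 2_000, "test": 2_000}


def read_parallel(src_path, tgt_path):
    """Read two line-aligned UTF-8 text files into (source, target) pairs.

    Assumes line i of the source file translates line i of the target file.
    """
    with open(src_path, encoding="utf-8") as f_src:
        src_lines = [line.rstrip("\n") for line in f_src]
    with open(tgt_path, encoding="utf-8") as f_tgt:
        tgt_lines = [line.rstrip("\n") for line in f_tgt]
    if len(src_lines) != len(tgt_lines):
        # A mismatch means the files are not aligned line by line.
        raise ValueError(
            f"line count mismatch: {len(src_lines)} vs {len(tgt_lines)}"
        )
    return list(zip(src_lines, tgt_lines))


def check_split(name, pairs):
    """Return True if the split holds the number of pairs the description states."""
    return len(pairs) == EXPECTED_PAIRS[name]
```

A call might look like `check_split("train", read_parallel("train.de", "train.hsb"))`, where both file names are placeholders; the actual names depend on the archive obtained from the download link.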
Provider:
WMT



