European Parliamentary Proceedings Translation Memory
Source: arXiv; indexed 2025-09-30
Download link:
https://joint-research-centre.ec.europa.eu/language-technology-resources/dgt-translation-memory_en
Description:
This dataset contains 393,371 source-to-target text pairs from European Parliament proceedings, drawn from the European Commission's DGT Translation Memory. It is split into 70% training, 20% development, and 10% test sets, and its intended task is machine translation.
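The 70/20/10 split can be reproduced approximately as in the sketch below. This is a minimal illustration only: the listing does not say how the original split was made (random or sequential, which seed, how fractional counts were rounded), so the shuffle, the seed, and the helper name `split_pairs` are assumptions rather than the dataset authors' procedure.

```python
import random


def split_pairs(pairs, train_frac=0.7, dev_frac=0.2, seed=0):
    """Shuffle text pairs and split them into train/dev/test lists.

    Hypothetical helper: the test set takes whatever remains after the
    train and dev portions, so the three parts always cover every pair.
    """
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility (assumed)

    n_train = int(len(shuffled) * train_frac)
    n_dev = int(len(shuffled) * dev_frac)

    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]
    return train, dev, test


if __name__ == "__main__":
    # Placeholder (source, target) pairs standing in for the real data.
    demo = [(f"source {i}", f"target {i}") for i in range(10)]
    train, dev, test = split_pairs(demo)
    print(len(train), len(dev), len(test))
```

Applied to all 393,371 pairs with these rounding rules, the sketch yields 275,359 training, 78,674 development, and 39,338 test pairs; the published split may differ by a few pairs depending on how it was rounded.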
Provider:
European Commission



