Training Data for Machine Translation
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/dimitarsh1/BiasMT
Description:
This dataset contains training data used to evaluate the linguistic richness and diversity of translations produced by different machine translation paradigms. It also supports comparing the diversity of the training data with that of the machine translation output, using multiple metrics. The dataset is large and is intended for assessing linguistic complexity and diversity in machine translation.



