scb-mt-en-th-2020
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/vistec-ai/dataset-releases/releases/tag/scb-mt-en-th-2020_v1.0
Description:
This dataset contains 1,001,752 English-Thai sentence pairs. Of these, 63,982 pairs were randomly selected to match the sample size of the CS dataset. The dataset is used to evaluate the performance of different machine translation models; its designated task is English-Thai machine translation evaluation.
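As an illustration of the random selection described above, the following is a minimal sketch of drawing a fixed-size subset of pair indices without replacement. Only the two counts come from the description; the function name `sample_indices`, the seed value, and the use of uniform sampling are assumptions, since the listing does not say how the original selection was performed.

```python
import random

TOTAL_PAIRS = 1_001_752  # corpus size stated in the description
SUBSET_SIZE = 63_982     # subset size stated in the description


def sample_indices(total, k, seed=0):
    """Return k distinct indices drawn uniformly from range(total), sorted.

    The seed is a hypothetical choice made here only for reproducibility.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(range(total), k))


# Indices of the sentence pairs that would form the subset.
indices = sample_indices(TOTAL_PAIRS, SUBSET_SIZE)
print(len(indices))
```

Sampling indices rather than the sentence pairs themselves keeps the sketch independent of the file layout of the release, which is not described here.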
Provider:
SCB



