MultiIndicMT dataset
Source: arXiv. Indexed: 2025-09-30
Download link:
http://lotus.kuee.kyoto-u.ac.jp/WAT/indic-multilingual/
Description:
This dataset contains sentence pairs for translation from 10 Indian languages into English, with between 23,000 and 50,000 pairs depending on the language pair. It falls into the low-resource category by size, is particularly suited to low-resource language settings, and targets the machine translation task.
Provider:
Workshop on Asian Translation (WAT)



