IndicMT Eval
Source: arXiv | Updated: 2023-07-03 | Indexed: 2024-06-21
Download link:
https://github.com/AI4Bharat/IndicMT-Eval
Summary:
IndicMT Eval is a dataset of 7,000 fine-grained annotations that follow the Multidimensional Quality Metrics (MQM) framework, covering 5 Indian languages and 7 machine translation systems. It was created at the Indian Institute of Technology Madras to evaluate machine translation quality for Indian languages. The dataset consists of English-to-Indian-language translation outputs, annotated by experienced language experts according to the MQM guidelines. Its main uses are evaluating and improving machine translation, in particular quality assessment for low-resource languages.
Provider:
Indian Institute of Technology Madras
Created:
2022-12-20



