ILCI corpus

Name: ILCI corpus
Creator: Indian Language Corpora Initiative
License: Not specified

Indexed from arXiv: 2025-09-30

Download link:

http://www.unicode.org/reports/tr35/

Description:

This dataset comprises 11 parallel corpora covering Hindi and English. It has been widely used in the literature for evaluating multi-source neural machine translation.

Provider:

Indian Language Corpora Initiative
