AH-Translit Benchmark Dataset
Source: India Data | Updated: 2026-01-29 | Indexed: 2026-05-16
Download link:
https://india-data.org/dataset-details/759e2466-b6d4-460a-a1fe-61207e885b1f
Description:
The AH-Translit_Bench dataset is a collection of parallel text pairs for developing and evaluating Arabic-to-Hindi transliteration systems. It draws text from three distinct domains: the Al-Quran, bibliographical entries, and a Modern Standard Arabic (MSA) category, giving broad coverage for robust model development. Each entry consists of an Arabic string and its manually curated transliteration into Hindi (Devanagari script). The dataset's structure allows straightforward loading and integration into various machine learning frameworks.
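As a rough illustration of loading such parallel pairs, the sketch below reads rows and checks that each side is in the expected script. The listing does not state the file format, so the tab-separated layout, the column names (`domain`, `arabic`, `hindi`), and the helper names `load_pairs` and `in_script` are all assumptions made for this example; the two sample rows are made up for illustration and are not taken from the dataset.

```python
import csv
import io
import unicodedata

# Made-up sample rows in an ASSUMED tab-separated layout (domain, arabic, hindi).
# The real dataset's file format and column names may differ.
SAMPLE_TSV = (
    "domain\tarabic\thindi\n"
    "msa\tسلام\tसलाम\n"
    "msa\tكتاب\tकिताब\n"
)

def load_pairs(handle):
    """Read (domain, arabic, hindi) tuples from a tab-separated file object."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [(row["domain"], row["arabic"], row["hindi"]) for row in reader]

def in_script(text, script_name):
    """Return True if every non-space character's Unicode name starts with script_name."""
    chars = [c for c in text if not c.isspace()]
    return bool(chars) and all(
        unicodedata.name(c, "").startswith(script_name) for c in chars
    )

pairs = load_pairs(io.StringIO(SAMPLE_TSV))
for domain, arabic, hindi in pairs:
    # Sanity check: source side is Arabic script, target side is Devanagari.
    assert in_script(arabic, "ARABIC"), arabic
    assert in_script(hindi, "DEVANAGARI"), hindi
print(len(pairs))
```

To use this on a downloaded file, pass `open(path, encoding="utf-8", newline="")` to `load_pairs` instead of the in-memory sample, adjusting the delimiter and column names to whatever the actual files use. The script check is deliberately strict: real entries containing digits or punctuation would fail it and need a looser rule.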
Provided by:
Natural Language Processing (NLP)
Created:
2025-09-17



