MLe2e
Source: arXiv. Indexed 2025-09-30.
Download link:
https://github.com/lluisgomez/script_identification
Resource description:
MLe2e is a multilingual dataset of 711 images containing 1178 word instances across four writing systems: Latin, Chinese, Kannada, and Korean. It is built around the characteristics that distinguish these writing systems, and its associated task is writing system clustering.



