johnlockejrr/RASAM
Hugging Face, updated 2024-07-02, indexed 2024-07-06
Download link:
https://hf-mirror.com/datasets/johnlockejrr/RASAM
Description:
The RASAM dataset is an open dataset for the recognition and analysis of Arabic Maghrebi scripts. It was produced between January and April 2021 through a collaborative hackathon organized by GIS MOMM (a research consortium on the Middle East and Muslim worlds) and Calfa. It covers a representative part of the handwritten production in Arabic Maghrebi scripts and aims to provide a reference dataset for training handwritten text recognition (HTR) models for these under-resourced scripts. The dataset includes 300 images with their related ground truth files in XML format, sourced from three manuscripts in the BULAC library collection that cover history and law. The dataset's annotations are divided into three levels: text regions, baselines, and text. The annotation work was carried out on the Calfa Vision platform, following collectively decided transcription specifications.
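As an illustration of the three annotation levels (text regions, baselines, text), the following is a minimal sketch of reading them from one ground truth file. It assumes a PAGE-XML-style layout (`TextRegion`, `TextLine`, `Baseline`, `TextEquiv/Unicode`); the description above only says the files are XML, so the element names, the inline `SAMPLE` document, and the helper `read_annotations` are all hypothetical and may need adapting to the actual RASAM files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in a PAGE-XML-style layout (an assumption, not taken
# from the RASAM files themselves).
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="example.jpg" imageWidth="1000" imageHeight="1500">
    <TextRegion id="r1">
      <Coords points="10,10 990,10 990,400 10,400"/>
      <TextLine id="l1">
        <Baseline points="20,100 980,105"/>
        <TextEquiv><Unicode>first line</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="l2">
        <Baseline points="20,200 980,198"/>
        <TextEquiv><Unicode>second line</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""


def local(tag):
    """Strip the '{namespace}' prefix so matching works for any schema version."""
    return tag.rsplit("}", 1)[-1]


def parse_points(points):
    """Turn 'x1,y1 x2,y2 ...' into a list of (x, y) integer tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def read_annotations(xml_text):
    """Return a list of regions, each with its lines' baselines and text."""
    root = ET.fromstring(xml_text)
    regions = []
    for region in root.iter():
        if local(region.tag) != "TextRegion":
            continue
        lines = []
        for line in region:
            if local(line.tag) != "TextLine":
                continue
            baseline, text = None, ""
            for child in line:
                name = local(child.tag)
                if name == "Baseline":
                    baseline = parse_points(child.get("points", ""))
                elif name == "TextEquiv":
                    for sub in child:
                        if local(sub.tag) == "Unicode":
                            text = sub.text or ""
            lines.append({"id": line.get("id"), "baseline": baseline, "text": text})
        regions.append({"id": region.get("id"), "lines": lines})
    return regions


if __name__ == "__main__":
    for region in read_annotations(SAMPLE):
        for line in region["lines"]:
            print(region["id"], line["id"], line["baseline"], line["text"])
```

To use this on a downloaded file, the file's contents could be read as text and passed to `read_annotations` in place of `SAMPLE`.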
Provided by:
johnlockejrr



