Dongba1800
Science Data Bank | Updated: 2024-09-12 | Indexed: 2026-04-23
Download link:
https://www.scidb.cn/detail?dataSetId=fde9b63f7ecc4eef9560a377717098f6
Resource description:
Dongba1800 is a dataset for single-character detection in Dongba manuscripts. It contains 1,800 curated JPEG images and 1,800 annotation files in TXT format. File names follow a consistent pattern so that each image can be matched to its annotations: images are named 'image_<number>.jpg' (e.g., 'image_1.jpg') and annotation files are named 'gt_image_<number>.txt' (e.g., 'gt_image_1.txt'). The annotations cover a verified total of 111,702 Dongba characters. Each character's position is given as a sequence of coordinate pairs that define the polygonal boundary of its text box. For example, the sequence "161, 59, 202, 57, 256, 85, 239, 154, 182, 147, 163, 107" lists the six vertices of a polygon, where each pair such as "161, 59" gives the x and y coordinates of one vertex. Vertices are typically listed in clockwise order so that they trace the full contour of the polygon. Within an annotation file, "###" marks the end of each record.
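The naming scheme and record format described above can be read with a few lines of Python. This is a minimal sketch, not an official loader: it assumes each record consists only of comma-separated integer coordinates and that "###" ends the record, as the description states; the exact line layout of the TXT files is not specified here, so the parser tolerates any whitespace or line breaks. The function names (annotation_name, parse_annotations, is_clockwise) are hypothetical and do not ship with the dataset.

```python
import re
from typing import List, Tuple

Point = Tuple[int, int]


def annotation_name(image_name: str) -> str:
    """Map 'image_<number>.jpg' to 'gt_image_<number>.txt'."""
    stem = image_name.rsplit(".", 1)[0]
    return "gt_" + stem + ".txt"


def parse_annotations(text: str) -> List[List[Point]]:
    """Split annotation text on '###' and return one polygon per record.

    Assumption: a record holds only integer coordinates (x1, y1, x2, y2, ...).
    """
    polygons = []
    for record in text.split("###"):
        numbers = [int(n) for n in re.findall(r"-?\d+", record)]
        if not numbers:
            continue  # empty chunk, e.g. after the final delimiter
        if len(numbers) % 2 != 0:
            raise ValueError("odd number of coordinates in record: %r" % record)
        polygons.append(list(zip(numbers[0::2], numbers[1::2])))
    return polygons


def is_clockwise(polygon: List[Point]) -> bool:
    """Check vertex order in image coordinates (y grows downward).

    With y pointing down, a positive shoelace sum means the vertices
    run clockwise as seen on screen.
    """
    total = 0
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        total += x1 * y2 - x2 * y1
    return total > 0


sample = "161, 59, 202, 57, 256, 85, 239, 154, 182, 147, 163, 107\n###\n"
polygons = parse_annotations(sample)
print(annotation_name("image_1.jpg"), len(polygons), polygons[0][0])
```

For a file on disk, the text could be read first (for example with pathlib.Path(...).read_text()) and passed to parse_annotations. Since the description says coordinates are "typically" clockwise, is_clockwise is useful as a sanity check rather than a guarantee.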
Provided by:
Shanxiong Chen; Yongbo Li; Yuqi Ma
Created:
2024-09-09



