Multi-Type Ancient Chinese Character Recognition (MTACCR) dataset
Source: Figshare. Updated 2025-06-08; indexed 2026-04-28.
Download link:
https://figshare.com/articles/dataset/_b_M_b_ulti-_b_T_b_ype_b_A_b_ncient_b_C_b_hinese_b_C_b_haracter_b_R_b_ecognition_MTACCR_dataset/29263991
Description:
The Multi-Type Ancient Chinese Character Recognition (MTACCR) dataset is a large-scale resource designed to advance research in ancient Chinese script analysis. It is built on the Table of General Standard Chinese Characters (通用规范汉字表), whose three levels contain 3,500 (Level 1), 3,000 (Level 2), and 1,605 (Level 3) characters, 8,105 in total. The dataset covers 7,874 Chinese characters, including standard, traditional, and variant forms. With over 9 million samples, it features diverse ancient character images, ranging from original scanned manuscripts to segmented glyphs, collected from calligraphic databases and open-source datasets and expanded through data augmentation. According to its authors, MTACCR surpasses existing datasets in character coverage, typological diversity, and scale, providing a comprehensive benchmark for character recognition and historical linguistics studies.
Created:
2025-06-08



