Multi-Type Ancient Chinese Character Recognition (MTACCR) dataset

Source: Figshare. Updated 2025-06-08. Indexed 2026-04-28.
Download link:
https://figshare.com/articles/dataset/_b_M_b_ulti-_b_T_b_ype_b_A_b_ncient_b_C_b_hinese_b_C_b_haracter_b_R_b_ecognition_MTACCR_dataset/29263991
Description:
The Multi-Type Ancient Chinese Character Recognition (MTACCR) dataset is a large-scale resource designed to advance research in ancient Chinese script analysis. It is built on the Table of General Standard Chinese Characters (通用规范汉字表), whose three levels contain 3,500 Level-1, 3,000 Level-2, and 1,605 Level-3 characters (8,105 in total). The dataset covers 7,874 Chinese characters across these levels, including standard, traditional, and variant forms. With over 9 million samples, it features diverse ancient character images, ranging from original scanned manuscripts to segmented glyphs, collected from calligraphic databases and open-source datasets and supplemented by data augmentation. MTACCR surpasses existing datasets in character coverage, typological diversity, and scale, providing a comprehensive benchmark for character recognition and historical linguistics studies.
Created:
2025-06-08