Ontology-Structured Dataset of Imperial Examination

Name: Ontology-Structured Dataset of Imperial Examination
Creator: Science Data Bank
Published: 2025-07-25 01:13:56
License: No description available

DataCite Commons; updated 2025-07-25; indexed 2026-05-05

Download link:

https://www.scidb.cn/detail?dataSetId=50587b48899c47f1ae21c880fc55a0fb

Description:

The imperial examination system was the core mechanism for selecting officials and enabling the mobility of social elites during the Ming and Qing dynasties. Systematically organizing information about imperial examination figures has significant academic value for revealing China's traditional social structure, talent mobility, and regional cultural change. This work draws on ancient books and documents such as "Record of Ascending to the Imperial Examination", "Record of Offering and Conquering", "Preliminary Collection of Inscriptions and Steles for the Imperial Calendar", and "Imperial Tribute Examination". Following the seven-step method, we construct the ontology with a bottom-up modeling approach. Modeling followed the principle of minimum sufficient expression while striving to represent the knowledge of the target domain comprehensively, and resulted in an ontology structure for the Ming Dynasty imperial examination. The ontology covers four core entities ("imperial examinations, characters, officials, and locations") and integrates key information about social mobility, identity history, and spatial distribution from ancient texts, thereby enabling accurate modeling of the social elite mobility mechanism during the Ming and Qing dynasties.

Based on this ontology structure and under the guidance of domain experts, a high-quality ontology dataset containing 529 ancient book instance texts was constructed. The dataset covers 3681 entities, 7393 explicit attributes, 3839 implicit attributes, and 3100 object attributes. It provides a standardized basis for systematically evaluating the inference performance of large language models, as well as a high-quality, structured data resource for pre-training large models in the field of ancient literature.

Because knowledge in ancient imperial examination texts is mostly expressed in sentences or paragraphs, this article systematically controls the length (complexity) of the input text and divides the constructed dataset into Group 1 (single-sentence text) and Group 2 (paragraph text).
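
As an illustration only, the annotation layers described above (four entity types, explicit and implicit attributes, object attributes linking entities, and the Group 1 / Group 2 split) could be represented roughly as follows. This is a minimal sketch: the class names, field names, the relation name, and the tuple layout are all assumptions made for this example, not the dataset's published schema, and the values are placeholders rather than real records.

```python
from dataclasses import dataclass, field

# The four core entity types named in the description above.
ENTITY_TYPES = ("imperial examination", "character", "official", "location")


@dataclass
class Entity:
    """One annotated entity; field names are hypothetical."""
    entity_id: str
    entity_type: str
    explicit: dict = field(default_factory=dict)  # attributes stated in the text
    implicit: dict = field(default_factory=dict)  # attributes inferred from the text

    def __post_init__(self):
        if self.entity_type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {self.entity_type}")


@dataclass
class Instance:
    """One instance text with its annotations; layout is hypothetical."""
    text: str
    group: int  # 1 = single-sentence text, 2 = paragraph text
    entities: list = field(default_factory=list)
    # Object attributes link two entities: (subject_id, relation, object_id).
    object_attributes: list = field(default_factory=list)


def count_annotations(instances):
    """Tally entities and the three attribute kinds across instances."""
    totals = {"entities": 0, "explicit": 0, "implicit": 0, "object": 0}
    for inst in instances:
        totals["entities"] += len(inst.entities)
        totals["explicit"] += sum(len(e.explicit) for e in inst.entities)
        totals["implicit"] += sum(len(e.implicit) for e in inst.entities)
        totals["object"] += len(inst.object_attributes)
    return totals


# Placeholder values only; none of this is taken from the dataset.
example = Instance(
    text="<instance text>",
    group=1,
    entities=[
        Entity("e1", "character",
               explicit={"name": "<name>"},
               implicit={"origin": "<place>"}),
        Entity("e2", "location", explicit={"name": "<place>"}),
    ],
    object_attributes=[("e1", "registered_in", "e2")],
)
print(count_annotations([example]))
```

Run over the full dataset, a tally of this kind would be expected to reproduce the totals reported in the description, provided the actual files are first mapped into a comparable structure.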

Provider:

Science Data Bank

Created:

2025-07-25
