Semi-supervised machine learning approaches for predicting the chronology of archaeological sites: A case study of temples from medieval Angkor, Cambodia
Source: Figshare. Updated: 2018-11-05. Indexed: 2026-04-29.
Download link:
https://figshare.com/articles/dataset/Semi-supervised_machine_learning_approaches_for_predicting_the_chronology_of_archaeological_sites_A_case_study_of_temples_from_medieval_Angkor_Cambodia/7298273
Description:
Archaeologists often need to date and group artifact types to discern typologies, chronologies, and classifications. For over a century, statisticians have been using classification and clustering techniques to infer patterns in data that can be defined by algorithms. In the case of archaeology, linear regression algorithms are often used to chronologically date features and sites, and pattern recognition is used to develop typologies and classifications. However, archaeological data is often expensive to collect, and analyses are often limited by small sample sizes and poor-quality datasets. Here we show that recent advances in computation allow archaeologists to use machine learning, based on much of the same statistical theory, to address more complex problems using increased computing power and larger, incomplete datasets. This paper approaches the problem of predicting the chronology of archaeological sites through a case study of medieval temples in Angkor, Cambodia. For this study, we have a large dataset of temples with known architectural elements and artifacts; however, less than ten percent of the temples in the sample have known dates, and much of the attribute data is incomplete. Our results suggest that the algorithms can predict dates for temples from 821–1150 CE with an average absolute error of 49 to 66 years. We find that this method surpasses traditional supervised and unsupervised statistical approaches for under-specified portions of the dataset and is a promising new method for anthropological inquiry.
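The description mentions three ingredients: a small fraction of dated temples, incomplete attribute data, and an average absolute error measured in years. The sketch below is only a toy illustration of how those pieces can fit together, not the authors' algorithm. It assumes a simple self-training loop around a nearest-neighbour predictor. All function names (`distance`, `knn_predict`, `self_train`, `mean_absolute_error`) and all attribute values and dates are hypothetical and invented for the example.

```python
from statistics import mean


def distance(a, b):
    """Mean absolute difference over attributes recorded in both records.

    Missing attributes are encoded as None and skipped (an assumption of
    this sketch, not something specified by the dataset).
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("inf")
    return mean(abs(x - y) for x, y in shared)


def knn_predict(features, labeled, k=1):
    """Average the dates of the k labeled records nearest to `features`."""
    nearest = sorted(labeled, key=lambda rec: distance(features, rec[0]))[:k]
    return mean(date for _, date in nearest)


def self_train(labeled, unlabeled, k=1):
    """Pseudo-label undated records one at a time, closest first.

    Each newly predicted date is added to the labeled pool, so later
    predictions can draw on earlier ones.
    """
    labeled = list(labeled)
    pending = list(unlabeled)
    predictions = {}
    while pending:
        best = min(
            pending,
            key=lambda f: min(distance(f, rec[0]) for rec in labeled),
        )
        date = knn_predict(best, labeled, k)
        predictions[best] = date
        labeled.append((best, date))
        pending.remove(best)
    return predictions


def mean_absolute_error(true_dates, predicted_dates):
    """Average absolute difference, in years, between true and predicted dates."""
    return mean(abs(t - p) for t, p in zip(true_dates, predicted_dates))


# Invented toy records: (attribute tuple, date in CE). None marks a missing value.
dated = [((1, 0, None), 900), ((0, 1, 1), 1100)]
undated = [(1, None, 0), (0, 1, None)]

predicted = self_train(dated, undated, k=1)

# Hypothetical held-out dates, used here only to show the error metric.
held_out = [920, 1080]
error = mean_absolute_error(held_out, [predicted[f] for f in undated])
```

A nearest-neighbour predictor is used here only because it tolerates missing attributes with very little code; any regressor that can be retrained on the growing labeled pool could take its place in the same loop.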
Created:
2018-11-05



