Learning to see the wood for the trees: machine learning, decision trees and the classification of isolated theropod teeth

DataONE | Updated 2021-03-03 | Indexed 2025-05-03
Download link:
https://search.dataone.org/view/sha256:2bc8b0c007ae444a3619da3ba3a7fa4a8351f5284a60798c39ef1caa226035fe
Description:
Taxonomic identification of fossils based on morphometric data traditionally relies on the use of standard linear models to classify such data. Machine learning and decision trees offer powerful alternative approaches to this problem but are not widely used in palaeontology. Here, we apply these techniques to published morphometric data of isolated theropod teeth in order to explore their utility in tackling taxonomic problems. We chose two published datasets consisting of 886 teeth from 14 taxa and 3020 teeth from 17 taxa, respectively, each with five morphometric variables per tooth. We also explored the effects that missing data have on the final classification accuracy. Our results suggest that machine learning and decision trees yield superior classification results over a wide range of data permutations, with decision trees achieving accuracies of 96% in classifying test data in some cases. Missing data or attempts to generate synthetic data to overcome missing data seriously degr...
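
As a rough illustration of the kind of decision-tree classification the description refers to, the sketch below grows a small tree on five numeric variables per tooth using Gini impurity. It is not the authors' implementation and makes no claim about which software or tree variant they used. The measurements, the taxon labels (`taxon_A`, `taxon_B`, `taxon_C`), and all function names are hypothetical, invented only to make the example runnable.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature index, threshold) that lowers weighted Gini, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Internal nodes are (feature, threshold, left, right); leaves are labels."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical toy data: five made-up measurements per tooth (NOT from the datasets).
train_rows = [
    (10.0, 5.0, 2.5, 12.0, 3.0), (11.0, 5.5, 2.7, 13.0, 3.2), (9.5, 4.8, 2.4, 11.5, 2.9),
    (30.0, 14.0, 7.0, 33.0, 2.0), (28.0, 13.0, 6.5, 31.0, 2.1), (32.0, 15.0, 7.5, 35.0, 1.9),
    (10.5, 5.2, 2.6, 12.5, 6.0), (9.8, 5.0, 2.5, 11.8, 6.5), (11.2, 5.6, 2.8, 13.2, 5.8),
]
train_labels = ["taxon_A"] * 3 + ["taxon_B"] * 3 + ["taxon_C"] * 3

tree = build_tree(train_rows, train_labels)
train_accuracy = sum(
    predict(tree, r) == y for r, y in zip(train_rows, train_labels)
) / len(train_rows)
```

Calling `predict(tree, row)` on a new five-value tuple returns the predicted label. On real data, accuracy would need to be measured on held-out test teeth rather than on the training set, and rows with missing measurements would have to be handled before training, since this sketch assumes every tooth has all five values.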

Created:
2025-04-20