Data from: Pacman profiling: a simple procedure to identify stratigraphic outliers in high-density deep-sea microfossil data
Source: DataONE | Updated: 2011-07-08 | Indexed: 2024-06-27
Download link:
https://search.dataone.org/view/null
Resource description:
The deep-sea microfossil record is characterized by an extraordinarily high density and abundance of fossil specimens and by a very high degree of spatial and temporal continuity of sedimentation. This record provides a unique opportunity to study evolution at the species level for entire clades of organisms. Compilations of deep-sea microfossil species occurrences are, however, affected by reworking of material, age model errors, and taxonomic uncertainties, all of which combine to displace a small fraction of the recorded occurrence data both forward and backward in time, artificially extending the total stratigraphic ranges of taxa. These outliers introduce substantial errors into both biostratigraphic and evolutionary analyses of species occurrences over time. We propose a simple method, Pacman, to identify and remove outliers from such data and to identify the problematic samples or sections from which the outliers derive. For a large group of species, the method compiles species occurrences by time and marks calibrated fractions of the youngest and oldest occurrences of each species as outliers. A subset of biostratigraphic marker species whose ranges have been previously documented is used to calibrate the fraction of occurrences to mark. The outlier occurrences are then tallied per sample, and profiles of outlier frequency are constructed for the sections used to compile the data. These profiles can identify samples and sections with problematic data caused, for example, by taxonomic errors, incorrect age models, or reworking of sediment. Such samples and sections can then be targeted for re-study.
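The trimming and profiling steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `pacman_flags` and `outlier_profile`, the `(species, sample_id, age_ma)` record layout, and the default trim fractions are all assumptions made for illustration. The calibration of the fractions against marker species is not shown; the fractions are simply passed in as parameters.

```python
from collections import defaultdict
from math import floor


def pacman_flags(occurrences, young_frac=0.05, old_frac=0.05):
    """Flag the youngest and oldest fractions of each species' occurrences.

    occurrences: list of (species, sample_id, age_ma) tuples (hypothetical layout).
    young_frac / old_frac: trim fractions; the defaults are placeholders,
    not calibrated values from the dataset.
    Returns a list of booleans parallel to `occurrences`.
    """
    by_species = defaultdict(list)
    for idx, (species, _sample, age) in enumerate(occurrences):
        by_species[species].append((age, idx))

    flags = [False] * len(occurrences)
    for records in by_species.values():
        records.sort()  # ascending age, so the youngest occurrences come first
        n = len(records)
        n_young = floor(n * young_frac)
        n_old = floor(n * old_frac)
        for _age, idx in records[:n_young]:
            flags[idx] = True
        for _age, idx in records[n - n_old:]:
            flags[idx] = True
    return flags


def outlier_profile(occurrences, flags):
    """Return the fraction of flagged occurrences per sample."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for (_species, sample, _age), is_outlier in zip(occurrences, flags):
        totals[sample] += 1
        flagged[sample] += int(is_outlier)
    return {sample: flagged[sample] / totals[sample] for sample in totals}
```

Under these assumptions, samples whose outlier fraction is much higher than that of their neighbors in a section would be the candidates for re-study. Grouping the per-sample values by section and ordering them by depth or age would give the outlier-frequency profile the abstract refers to.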
Created:
2011-07-08



