ICP-MS measurements and R Code to determine provenance soil type from analyses of Pinus ponderosa ash collected in Arizona and Colorado

Source: DataCite Commons. Updated: 2025-09-11. Indexed: 2026-05-07.
Download link:
https://www.sciencebase.gov/catalog/item/67b8bde4d34e1a2e835b7ee9
Description:
Needle samples from 140 Pinus ponderosa trees were analyzed for trace elements by inductively coupled plasma mass spectrometry (ICP-MS). Samples were collected from eight locations representing five distinct soil types across the Colorado Plateau near Flagstaff, Arizona and Boulder, Colorado. The data include a full spectral scan over m/z 5-245, which was condensed to 72 dominant atomic masses. Instrument drift, matrix effects, and differences in sample mass were corrected using internal standard ion count intensities and individual sample weights. Further statistical analysis was performed in R, executed within the RStudio environment. Classification was performed using three preprocessing techniques and five machine learning algorithms, including hierarchical modeling structures to optimize separation.

The data files provided here are the metadata file, PineAsh_Metadata.xml; a Microsoft Office Excel file, PineAsh_Data.xlsx, with six worksheets (an Introduction and data tables T01-T05); and the individual data tables as comma-separated value (.csv) files:
T01_Pinus_ponderosa_DataDiction.csv: the data dictionary, containing entity and attribute metadata in table format for T02-T05.
T02_ICP_Mass_Spectrum.csv: the mass spectral scan results of 140 samples across the m/z range 5-245.
T03_ICP_Mass_Spec_Censored.csv: the mass spectral scan results of 140 samples for the 72 dominant atomic masses.
T04_IntStd_Corrected.csv: the mass spectral scan results of 140 samples for the 72 dominant atomic masses, normalized using internal standard ion count intensities.
T05_Mass_Corrected.csv: the mass spectral scan results of 140 samples for the 72 dominant atomic masses, normalized using internal standard ion count intensities and sample mass.

The R code consists of three scripts:
Preprocessing_ModelTraining.R: the data preprocessing and model training script. It evaluates multiple preprocessing strategies and classification models to determine the best combination. It applies transformations, partitions the dataset, and conducts cross-validation to assess model performance across different classification algorithms and structures. The required data file, ICP_8pineAsh_onlyAtoms.mat, is included.
ModelSelection_VariableImportance.R: the model selection and variable importance script. It takes the best-performing model, chosen on accuracy and Cohen's kappa, and calculates variable importance using model-specific and model-independent metrics to determine the most influential features for classification. The required data file, Ash_rework.csv, is included.
EnhancedVisualization.R: a visualization script that refines the figures produced by the prior scripts so that the results are clearer to interpret and present.

A photo of a smoke plume from a managed wildfire rising above Ponderosa pine trees in New Mexico can be found in the USGS Images Library at https://www.usgs.gov/media/images/a-convective-smoke-plume-a-managed-wildfire.
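
The description says the best model is chosen on accuracy and Cohen's kappa but does not show how the two metrics are computed or combined. The following is a minimal Python sketch of both metrics, not the release's R code. The label values, the candidate names (`model_1`, `model_2`), and the rule of ranking by accuracy first and kappa second are all assumptions made for illustration. Kappa is computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from the class frequencies.

```python
from collections import Counter
import math


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between truth and prediction beyond chance."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)  # observed agreement
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: sum over classes of P(true = c) * P(pred = c).
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if math.isclose(p_e, 1.0):
        return float("nan")  # undefined when only one class is present
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical soil-type labels and predictions from two hypothetical models.
y_true = ["A", "A", "B", "B", "C", "C"]
candidates = {
    "model_1": ["A", "A", "B", "C", "C", "C"],
    "model_2": ["A", "B", "B", "C", "C", "A"],
}

# Score each candidate as an (accuracy, kappa) pair.
scores = {
    name: (accuracy(y_true, pred), cohen_kappa(y_true, pred))
    for name, pred in candidates.items()
}

# Assumed selection rule: highest accuracy, with kappa breaking ties.
best = max(scores, key=lambda name: scores[name])

for name, (acc, kappa) in scores.items():
    print(f"{name}: accuracy={acc:.3f}, kappa={kappa:.3f}")
print("selected:", best)
```

Kappa is a useful companion to accuracy when class sizes are unequal, since a classifier that favors the most common soil type can score a high accuracy while agreeing with the true labels little more than chance would.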
Provider:
U.S. Geological Survey
Created:
2025-07-22