Predictive Modeling of Empty Puparia Age Using Cuticular Hydrocarbon Concentrations: A Machine Learning Approach
Mendeley Data | Updated 2024-05-29 | Indexed 2024-06-26
Download link:
https://data.mendeley.com/datasets/m68y9brvfw
Description:
This dataset comprises concentration measurements of four cuticular hydrocarbons (Pentacosane - C25, Heptacosane - C27, Octacosane - C28, and Nonacosane - C29) extracted from empty puparia of Calliphora vicina. The puparia were stored in two pupation media, paper towel and soil, under controlled laboratory conditions. Measurements were taken at various ages of the empty puparia. Each row in the dataset represents one observation: the age of the empty puparia in days, together with the concentrations of the four hydrocarbons in nanograms per microliter (ng/µL), that is, the quantity of each hydrocarbon present in one microliter of the extraction solution. For analysis, two machine learning models, Support Vector Machine (SVM) and eXtreme Gradient Boosting (XGBoost), were chosen to suit the characteristics of the dataset. Both models use the concentrations of n-C25, n-C27, n-C28, and n-C29 to predict the age of the empty puparia. The dataset is supplemented with three files: one Excel sheet containing the concentration measurements of empty puparia investigated over 180 days in laboratory conditions, and two Word files containing the R scripts that implement the SVM and XGBoost algorithms for estimating the age of the empty puparia.
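The row layout described above (one age value plus four concentrations per observation) can be illustrated with a minimal Python sketch. It is not the authors' R code. The column names `age_days`, `C25`, `C27`, `C28`, and `C29` are assumptions, as is the idea that the Excel sheet has been exported to CSV, and the sample values are placeholders rather than measurements from the dataset.

```python
import csv
import io
from dataclasses import dataclass

# Hypothetical column names; the real Excel sheet may label them differently.
FEATURES = ("C25", "C27", "C28", "C29")
TARGET = "age_days"


@dataclass
class Observation:
    age_days: float          # age of the empty puparia, in days
    concentrations: tuple    # ng/µL, ordered as in FEATURES


def load_observations(handle):
    """Parse rows holding one age column and four concentration columns."""
    rows = []
    for record in csv.DictReader(handle):
        conc = tuple(float(record[name]) for name in FEATURES)
        rows.append(Observation(float(record[TARGET]), conc))
    return rows


def to_xy(observations):
    """Split observations into a feature matrix X and a target vector y."""
    X = [list(obs.concentrations) for obs in observations]
    y = [obs.age_days for obs in observations]
    return X, y


# Placeholder values for illustration only, NOT measurements from the dataset.
SAMPLE_CSV = """age_days,C25,C27,C28,C29
1,1.0,2.0,3.0,4.0
30,0.5,1.5,2.5,3.5
"""

observations = load_observations(io.StringIO(SAMPLE_CSV))
X, y = to_xy(observations)
```

With the data in this shape, `X` (the four hydrocarbon concentrations) and `y` (age in days) could be passed to any regression model. In Python, scikit-learn's `SVR` or xgboost's `XGBRegressor` would be rough counterparts to the SVM and XGBoost models mentioned above, though the supplied scripts are written in R and may be configured differently.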
Created:
2024-05-26



