Impact of experimental group, disease, mutation, age and sex on lymphoblast metabolic profiles.
Source: DataCite Commons. Updated: 2022-03-10. Indexed: 2024-08-18.
Download link:
https://figshare.com/articles/dataset/Impact_of_experimental_group_disease_mutation_age_and_sex_on_lymphoblast_metabolic_profiles_/19229268
Description:
Mutual Information (Information Gain) identified the features with the greatest discriminative power between experimental conditions and enabled the selection of the most discriminant features for subsequent principal component analysis (PCA).
We further evaluated the performance of a Naïve Bayes classifier with leave-one-out cross-validation using the selected features. Confusion matrices show the true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN) as a percentage of the total actual instances of each class. The following metrics were calculated: Area Under the Curve (AUC); Classification Accuracy (CA) = (TP + TN)/(TP + TN + FP + FN); F1 score = 2 × (Precision × Recall)/(Precision + Recall); Precision = TP/(TP + FP); Recall = TP/(TP + FN).
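The metric definitions in the description can be illustrated with a minimal Python sketch. The function names `classification_metrics` and `confusion_percent`, and the counts used in the example, are hypothetical and chosen only for illustration; they are not taken from the dataset, and this is not necessarily how the authors computed their values. AUC is omitted because it requires classifier scores rather than counts.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute CA, Precision, Recall and F1 from raw confusion counts."""
    total = tp + tn + fp + fn
    ca = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * (precision * recall) / denom if denom else 0.0
    return {"CA": ca, "Precision": precision, "Recall": recall, "F1": f1}


def confusion_percent(matrix):
    """Express each row (actual class) as a percentage of that row's total."""
    result = []
    for row in matrix:
        row_total = sum(row)
        result.append([100 * count / row_total if row_total else 0.0
                       for count in row])
    return result


# Hypothetical counts, not values from the dataset.
metrics = classification_metrics(tp=8, tn=9, fp=1, fn=2)

# Rows are actual classes, columns are predicted classes (assumed layout).
percent = confusion_percent([[8, 2], [1, 9]])
```

With these illustrative counts, each row of `percent` sums to 100, which matches the description's convention of reporting confusion matrix entries relative to the actual instances of each class.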
Provider:
figshare
Created:
2022-02-25



