Feature selection and molecular classification of cancer phenotypes: a comparative study

Name: Feature selection and molecular classification of cancer phenotypes: a comparative study
Creator: Centro di Ateneo per le Biblioteche dell'Università degli Studi di Padova
Published: 2022-08-09 15:10:30
License: No description available

DataCite Commons (updated 2022-08-09, indexed 2024-07-13)

Download link:

http://researchdata.cab.unipd.it/id/eprint/679

Resource description:

Classification of high-dimensional gene expression data is key to the development of effective diagnostic and prognostic tools. Feature selection involves finding the subset of features with the highest power in predicting class labels. Here we conducted a comparative study of different combinations of feature selectors (Chi-Squared, mRMR, Relief-F, Genetic Algorithms) and classification learning algorithms (Random Forests, PLS-DA, SVM, Regularized Logistic/Multinomial Regression, kNN) to identify those with the best predictive capacity. The performance of each combination is evaluated through an empirical study on three benchmark cancer-related microarray datasets. Our results first suggest that the quality of the data relevant to the target classes is key to the successful classification of cancer phenotypes. We also found that, for a given classification learning algorithm and dataset, all filters perform similarly. Interestingly, filters achieve results comparable to or even better than those of the GA-based wrappers, while also being easier to implement and faster. Taken together, our findings suggest that simple, well-established feature selectors in combination with optimized classifiers guarantee good performance, with no need for complicated and computationally demanding methodologies.
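To make the filter-plus-classifier idea in the description concrete, the following is a minimal sketch, not the authors' code. It pairs a chi-squared filter with a kNN classifier and evaluates the pair by cross-validation, with the filter fitted inside each training fold. Everything here is an assumption made for illustration: the synthetic data, the median-split binarization used before the chi-squared test, the choice of 3 neighbours and 5 folds, and all function names.

```python
import random
from collections import Counter


def make_toy_data(n_samples=60, n_features=200, n_informative=5, seed=0):
    """Synthetic stand-in for an expression matrix (not a real dataset)."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n_samples)]
    # The first n_informative features are shifted for class 1; the rest are noise.
    X = [[rng.gauss(3.0 * label if j < n_informative else 0.0, 1.0)
          for j in range(n_features)] for label in y]
    return X, y


def chi2_score(column, labels):
    """Chi-squared statistic between a median-binarized feature and the labels."""
    threshold = sorted(column)[len(column) // 2]  # median split: an assumption
    bins = [int(v > threshold) for v in column]
    n = len(labels)
    bin_counts, class_counts = Counter(bins), Counter(labels)
    joint = Counter(zip(bins, labels))
    score = 0.0
    for b, nb in bin_counts.items():
        for c, nc in class_counts.items():
            expected = nb * nc / n
            score += (joint[(b, c)] - expected) ** 2 / expected
    return score


def select_top_k(X, y, k):
    """Filter step: indices of the k features with the highest chi-squared score."""
    n_features = len(X[0])
    scores = [chi2_score([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def knn_predict(X_train, y_train, x, n_neighbors=3):
    """Majority vote among the nearest training samples (squared Euclidean)."""
    dists = sorted((sum((a - b) ** 2 for a, b in zip(row, x)), label)
                   for row, label in zip(X_train, y_train))
    votes = Counter(label for _, label in dists[:n_neighbors])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, k_features, n_folds=5, seed=0):
    """Cross-validated accuracy with feature selection redone in every fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(n_folds):
        test = idx[f::n_folds]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        feats = select_top_k(X_tr, y_tr, k_features)  # uses training data only
        X_tr_sel = [[row[j] for j in feats] for row in X_tr]
        for i in test:
            x = [X[i][j] for j in feats]
            correct += int(knn_predict(X_tr_sel, y_tr, x) == y[i])
    return correct / len(X)


if __name__ == "__main__":
    X, y = make_toy_data()
    print("selected features:", select_top_k(X, y, 5))
    print("CV accuracy:", cross_val_accuracy(X, y, k_features=5))
```

Fitting the filter inside each fold keeps the held-out samples from influencing which features are chosen, which would otherwise inflate the accuracy estimate. Swapping in a different filter or classifier only requires replacing `select_top_k` or `knn_predict`.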

Provided by:

Centro di Ateneo per le Biblioteche dell'Università degli Studi di Padova

Created:

2022-08-09
