Isometric Stratified Ensembles: A Partial and Incremental Adaptive Applicability Domain and Consensus-Based Classification Strategy for Highly Imbalanced Data Sets with Application to Colloidal Aggregation
Source: NIAID Data Ecosystem. Indexed: 2026-03-13
Download link:
https://figshare.com/articles/dataset/Isometric_Stratified_Ensembles_A_Partial_and_Incremental_Adaptive_Applicability_Domain_and_Consensus-Based_Classification_Strategy_for_Highly_Imbalanced_Data_Sets_with_Application_to_Colloidal_Aggregation/19487109
Resource description:
Partial and incremental stratification analysis of a quantitative
structure-interference relationship (QSIR) is a novel strategy intended
to categorize the classifications produced by machine learning techniques.
It is based on a 2D mapping of classification statistics onto two
categorical axes: the degree of consensus and level of applicability
domain. An internal cross-validation set makes it possible to determine the statistical
performance of the ensemble at every 2D map stratum and hence to define
isometric local performance regions with the aim of better hit ranking
and selection. During training, isometric stratified ensembles (ISE)
applies a recursive decorrelated variable selection and considers
the cardinal ratio of classes to balance training sets and thus avoid
bias due to possible class imbalance. To illustrate the value of
this strategy, three different highly imbalanced pairs of PubChem
AmpC β-lactamase and cruzain inhibition assay campaigns on colloidal
aggregators were employed, together with the complementary aggregator data set available on the AGGREGATOR ADVISOR predictor web page. Statistics
obtained using this new strategy outperform those of
previously published tools, with and without a classical applicability
domain. ISE performance in classifying colloidal aggregators ranges
from a global AUC of 0.82, when the whole test data set is considered,
to a maximum AUC of 0.88, when only its highest-confidence isometric
stratum is retained.
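
The 2D stratification described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the integer encoding of the applicability domain level, the use of a simple hit rate as the per-stratum statistic, and the toy validation data are all assumptions made for illustration only.

```python
from collections import defaultdict


def stratify(votes, ad_levels, labels):
    """Bin validation compounds into (consensus, AD level) strata.

    votes     : per-compound list of 0/1 votes, one per ensemble model
    ad_levels : per-compound integer applicability domain level (hypothetical encoding)
    labels    : per-compound true class (1 = aggregator, 0 = non-aggregator)
    Returns {(consensus, ad_level): (n_compounds, n_true_aggregators)}.
    """
    cells = defaultdict(lambda: [0, 0])
    for v, ad, y in zip(votes, ad_levels, labels):
        consensus = sum(v)  # number of models voting "aggregator"
        cell = cells[(consensus, ad)]
        cell[0] += 1
        cell[1] += y
    return {key: tuple(counts) for key, counts in cells.items()}


def incremental_hit_rate(cells, min_consensus, min_ad):
    """Fraction of true aggregators among all compounds whose stratum is
    at or above the given consensus and AD thresholds (None if empty)."""
    n = tp = 0
    for (consensus, ad), (count, positives) in cells.items():
        if consensus >= min_consensus and ad >= min_ad:
            n += count
            tp += positives
    return tp / n if n else None


# Toy validation set: three models, AD levels 0 (outside) to 2 (well inside).
votes = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0], [1, 1, 0]]
ad_levels = [2, 2, 1, 1, 0, 2]
labels = [1, 1, 0, 1, 0, 0]

cells = stratify(votes, ad_levels, labels)
strict = incremental_hit_rate(cells, min_consensus=3, min_ad=2)
relaxed = incremental_hit_rate(cells, min_consensus=1, min_ad=1)
```

In this sketch, tightening the thresholds keeps fewer compounds but selects strata where the validation statistic is higher, which mirrors the idea of retaining only the highest-confidence stratum for hit ranking.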
Created:
2022-03-31



