Heterogeneous Classifier Fusion for Ligand-Based Virtual Screening: Or, How Decision Making by Committee Can Be a Good Thing
Source: NIAID Data Ecosystem (indexed 2026-03-08)
Download link:
https://figshare.com/articles/dataset/Heterogeneous_Classifier_Fusion_for_Ligand_Based_Virtual_Screening_Or_How_Decision_Making_by_Committee_Can_Be_a_Good_Thing/2350483
Description:
The concept of data fusion, the combination of information from different
sources describing the same object with the expectation of generating
a more accurate representation, has found application in a very broad
range of disciplines. In the context of ligand-based virtual screening
(VS), data fusion has been applied to combine knowledge from either
different active molecules or different fingerprints to improve similarity
search performance. Machine-learning (ML) methods based on fusion
of multiple homogeneous classifiers, in particular random forests (RF),
have also been widely applied in the ML literature. The heterogeneous
version of classifier fusion, fusing the predictions from different
model types, has been less explored. Here, we investigate heterogeneous
classifier fusion for ligand-based VS using three different ML methods,
RF, naïve Bayes (NB), and logistic regression (LR), with
four 2D fingerprints, atom pairs, topological torsions, RDKit fingerprint,
and circular fingerprint. The methods are compared using a previously
developed benchmarking platform for 2D fingerprints which is extended
to ML methods in this article. The original data sets are filtered
for difficulty, and a new set of challenging data sets from ChEMBL
is added. Data sets were also generated for a second use case: starting
from a small set of related actives instead of diverse actives. The
final fused model consistently outperforms the other approaches across
the broad variety of targets studied, indicating that heterogeneous
classifier fusion is a very promising approach for ligand-based VS.
The new data sets together with the adapted source code for ML methods
are provided in the Supporting Information.
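
As an illustration of what fusing the predictions from different model types can look like, the following is a minimal sketch only. The abstract does not state which fusion rule was used, so mean-rank fusion is assumed here. The function names and the score values are hypothetical and are not taken from the data set or its source code.

```python
from typing import Dict, List


def ranks_descending(scores: List[float]) -> List[float]:
    """Return 1-based ranks (1 = highest score); ties share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next molecule has the same score.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # mean of the 1-based positions in the group
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    return ranks


def fuse_by_mean_rank(model_scores: Dict[str, List[float]]) -> List[float]:
    """Fuse per-model scores into one value per molecule (lower = better)."""
    per_model_ranks = [ranks_descending(s) for s in model_scores.values()]
    n_models = len(per_model_ranks)
    return [sum(col) / n_models for col in zip(*per_model_ranks)]


# Hypothetical predicted probabilities of activity for four molecules,
# one list per model type (assumed values, for illustration only).
scores = {
    "RF": [0.90, 0.20, 0.60, 0.10],
    "NB": [0.70, 0.80, 0.75, 0.05],
    "LR": [0.95, 0.30, 0.50, 0.20],
}
fused = fuse_by_mean_rank(scores)

# Molecule indices ordered from most to least promising after fusion.
ranking = sorted(range(len(fused)), key=lambda i: fused[i])
```

Ranks are fused rather than raw scores because the outputs of different model types are not necessarily on comparable scales. In this toy input, a molecule that every model places second ends up ahead of one that a single model places first and the others place third.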
Created:
2016-02-18



