
The Ensemble Bridge Algorithm: A New Modeling Tool for Drug Discovery Problems

Source: NIAID Data Ecosystem (indexed 2026-03-06)
Download link:
https://figshare.com/articles/dataset/The_Ensemble_Bridge_Algorithm_A_New_Modeling_Tool_for_Drug_Discovery_Problems/2790187
Description:
Ensemble algorithms have historically been categorized into two separate paradigms, boosting and random forests, which differ significantly in the way each ensemble is constructed. Boosting algorithms represent one extreme, where an iterative greedy optimization strategy, weak learners (e.g., small classification trees), and stage weights are employed to target difficult-to-classify regions in the training space. At the other extreme, random forests rely on randomly selected features and complex learners (learners that exhibit low bias, e.g., large regression trees) to classify well over the entire training data. Because random forests do not target the next learner for inclusion, they tend to be naturally robust to noisy labels. In this work, we introduce the ensemble bridge algorithm, which is capable of transitioning between boosting and random forests using a regularization parameter ν ∈ [0,1]. Because the ensemble bridge algorithm is a compromise between the greedy nature of boosting and the randomness present in random forests, it yields robust performance in the presence of a noisy response and superior performance in the presence of a clean response. Drug discovery data (e.g., computational chemistry data) often have varying levels of noise. Hence, this method enables a practitioner to employ a single method to evaluate ensemble performance. The method’s robustness is verified across a variety of data sets, where the algorithm repeatedly yields better performance than either boosting or random forests alone. Finally, we provide diagnostic tools for the new algorithm, including a measure of variable importance and an observational clustering tool.
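The description does not spell out how ν moves the ensemble between the two paradigms, so the following is only a toy sketch of one possible interpolation, not the published ensemble bridge algorithm. It assumes that ν = 0 gives forest-like behavior (each learner fits the raw response on a bootstrap sample and the learners are averaged) and that ν = 1 gives boosting-like behavior (each learner fits the current residual and the learners are summed). The direction of ν, the normalization in `combine`, and the names `fit_stump`, `combine`, `fit_bridge`, and `predict` are all hypothetical. One-split stumps on a single feature stand in for the trees to keep the example short, so random feature selection is not shown.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression stump on 1-D inputs and return it as a function."""
    cuts = sorted(set(xs))
    best = None  # (sum of squared errors, threshold, left mean, right mean)
    for lo, hi in zip(cuts, cuts[1:]):
        t = (lo + hi) / 2.0
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # every sampled x is identical: fall back to a constant
        mean = sum(ys) / len(ys)
        return lambda x: mean
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm


def combine(total, count, nu):
    """Assumed combining rule: mean of learners at nu=0, sum of learners at nu=1."""
    if count == 0:
        return 0.0
    return total / (1.0 + (1.0 - nu) * (count - 1))


def fit_bridge(xs, ys, nu, n_learners=20, seed=0):
    """Toy interpolation between bagging-style and boosting-style fitting."""
    rng = random.Random(seed)
    n = len(xs)
    totals = [0.0] * n  # running sum of learner outputs at each training point
    learners = []
    for m in range(n_learners):
        current = [combine(totals[i], m, nu) for i in range(n)]
        # nu=0: target is the raw response; nu=1: target is the full residual.
        targets = [ys[i] - nu * current[i] for i in range(n)]
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        stump = fit_stump([xs[i] for i in idx], [targets[i] for i in idx])
        learners.append(stump)
        for i in range(n):
            totals[i] += stump(xs[i])
    return learners


def predict(learners, x, nu):
    """Combine the learners with the same rule used during fitting."""
    return combine(sum(h(x) for h in learners), len(learners), nu)
```

Calling `fit_bridge(xs, ys, nu=0.0)` and `fit_bridge(xs, ys, nu=1.0)` on the same data exercises the two endpoints, and intermediate values of ν blend them under the assumed rule. Note that `predict` must be given the same ν that was used for fitting.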
Created:
2010-02-22