AMPL: A Data-Driven Modeling Pipeline for Drug Discovery
Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://figshare.com/articles/dataset/AMPL_A_Data-Driven_Modeling_Pipeline_for_Drug_Discovery/12137388
Description:
One of the key requirements for incorporating
machine learning (ML) into the drug discovery process is complete
traceability and reproducibility of the model building and evaluation
process. With this in mind, we have developed an end-to-end modular
and extensible software pipeline for building and sharing ML models
that predict key pharma-relevant parameters. The ATOM Modeling PipeLine,
or AMPL, extends the functionality of the open source library DeepChem
and supports an array of ML and molecular featurization tools. We
have benchmarked AMPL on a large collection of pharmaceutical data
sets covering a wide range of parameters. Our key findings indicate
that traditional molecular fingerprints underperform other feature
representation methods. We also find that data set size correlates
directly with prediction performance, which points to the need to
expand public data sets. Uncertainty quantification can help predict
model error, but correlation with error varies considerably between
data sets and model types. Our findings point to the need for an extensible
pipeline that can be shared to make model building more widely accessible
and reproducible. This software is open source and available at: https://github.com/ATOMconsortium/AMPL.
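
The claim that uncertainty quantification correlates with model error can be checked on any held-out set that has true values, predictions, and a per-compound uncertainty estimate. The following is a minimal sketch of one way to do that, assuming Spearman rank correlation between the predicted standard deviation and the absolute error as the measure; the abstract does not say which statistic was used. The function names (`ranks`, `spearman`, `uq_error_correlation`) and the toy numbers are hypothetical and are not part of the AMPL or DeepChem API.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def uq_error_correlation(y_true, y_pred, y_std):
    """Rank correlation between predicted uncertainty and absolute error."""
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return spearman(y_std, abs_err)


# Toy, made-up values for illustration only (not results from the paper).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 2.5, 2.0, 4.0]
y_std = [0.1, 0.4, 0.9, 0.05]
rho = uq_error_correlation(y_true, y_pred, y_std)
print(f"Spearman rho between uncertainty and |error|: {rho:.2f}")
```

A value near 1 would mean that larger predicted uncertainties tend to accompany larger errors; a value near 0 would mean the uncertainty estimate carries little information about error for that data set and model type.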
Created:
2020-04-27



