CARA: Benchmarking Compound Activity Prediction for Real-World Drug Discovery Applications
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://zenodo.org/record/11063964
Description:
Identifying active compounds for target proteins is fundamental to early drug discovery. Recently, data-driven computational methods have shown promising potential for predicting compound activities. However, a well-designed benchmark that comprehensively evaluates these methods from a practical perspective is still lacking. To fill this gap, we propose a benchmark named CARA. By carefully distinguishing assay types, designing train-test splitting schemes, and selecting evaluation metrics, CARA accounts for the biased distribution of current real-world compound activity data and avoids overestimating model performance. We observed that current models make successful predictions for a certain proportion of assays, while performance varied across assays. In addition, an evaluation of several few-shot training strategies showed that their performance differed by task type. Overall, we provide a high-quality dataset for developing and evaluating compound activity prediction models, and the analyses in this work may inspire better applications of data-driven models in drug discovery.
Created:
2024-04-25



