
Descriptor-First Approach for ADMET Prediction in the PolarisHub Antiviral Challenge

Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://figshare.com/articles/dataset/Descriptor-First_Approach_for_ADMET_Prediction_in_the_PolarisHub_Antiviral_Challenge/30866767
Description:
The prediction of absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties remains a central bottleneck in small-molecule discovery. We present the third-place solution from the PolarisHub Antiviral Competition, covering five end points broadly relevant to small-molecule design: human and mouse liver microsomal stability (HLM, MLM), MDR1-MDCKII permeability, kinetic solubility, and lipophilicity (LogD). Rather than pursuing complex machine learning architectures, we adopted a descriptor-first strategy. We systematically curated descriptors and models from ADMET Predictor as meta-features and then applied high-capacity tabular learners. A pretrained foundation model for tabular data (TabPFN), used in single-task regression, consistently matched or outperformed a strong gradient boosting baseline (CatBoost), reducing mean absolute error (MAE) by up to 44% across end points while simplifying deployment by eliminating an extensive hyperparameter search and producing compact models. Additionally, we engineered two feature sets that delivered modest gains in randomized cross-validation runs: (i) tuned fragment representations and (ii) site-of-metabolism pattern features. Overall, we used four groups of features: mechanistic, physicochemical, fragment, and metabolic. These results indicate that in practical ADMET modeling scenarios, where rich, validated descriptors are available, competitive advantage often arises from principled feature engineering combined with robust, rather than overly complex, modeling approaches.
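For readers unfamiliar with the metric, the sketch below shows how a relative MAE reduction such as the quoted "up to 44%" is conventionally computed: the drop in MAE divided by the baseline MAE. The function names and all numeric values are hypothetical and chosen only for illustration; they are not data or code from the study, and the abstract does not state the exact formula the authors used.

```python
from statistics import fmean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return fmean([abs(t - p) for t, p in zip(y_true, y_pred)])


def relative_mae_reduction(mae_baseline, mae_candidate):
    """Fraction by which the candidate lowers MAE relative to the baseline."""
    if mae_baseline <= 0:
        raise ValueError("baseline MAE must be positive")
    return (mae_baseline - mae_candidate) / mae_baseline


# Made-up end-point values (assumption: NOT taken from the challenge data).
y_true = [1.0, 2.0, 3.0, 4.0]
baseline_pred = [1.5, 2.5, 2.5, 4.5]        # every absolute error is 0.5
candidate_pred = [1.25, 2.25, 2.75, 4.25]   # every absolute error is 0.25

mae_base = mean_absolute_error(y_true, baseline_pred)
mae_cand = mean_absolute_error(y_true, candidate_pred)
reduction = relative_mae_reduction(mae_base, mae_cand)
print(f"baseline MAE={mae_base}, candidate MAE={mae_cand}, reduction={reduction:.0%}")
```

Under this convention, a 44% reduction would correspond to, for example, a baseline MAE of 1.00 falling to 0.56 on a given end point. In a single-task setting the calculation is repeated separately for each end point.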
Created:
2025-12-11