Trade-off Predictivity and Explainability for Machine-Learning Powered Predictive Toxicology: An in-Depth Investigation with Tox21 Data Sets
Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://figshare.com/articles/dataset/Trade-off_Predictivity_and_Explainability_for_Machine-Learning_Powered_Predictive_Toxicology_An_in-Depth_Investigation_with_Tox21_Data_Sets/13667538
Description:
Selecting a model in predictive toxicology often involves a trade-off
between prediction performance and explainability: should we sacrifice
model performance to gain explainability, or vice versa? Here we
present a comprehensive study to assess algorithm and feature influences
on model performance in chemical toxicity research. We built over
5000 models for a Tox21 bioassay data set of 65 assays and ∼7600
compounds. Seven molecular representations as features and 12 modeling
approaches varying in complexity and explainability were employed
to systematically investigate the impact of various factors on model
performance and explainability. We demonstrated that end points dictated
a model’s performance, regardless of the chosen modeling approach
(including deep learning) and chemical features. Overall, more complex
models such as (LS-)SVM and Random Forest performed marginally better
than simpler models such as linear regression and KNN in the presented
Tox21 data analysis. Since a simpler model with acceptable performance
is often also easy to interpret for the Tox21 data set, it clearly
was the preferred choice due to its better explainability. Given that
each data set has its own error structure for both dependent and independent
variables, we strongly recommend conducting a
systematic study with a broad range of model complexity and feature
explainability to identify a model that balances predictivity and explainability.
Created:
2021-01-29



