How Precise Are Our Quantitative Structure–Activity Relationship Derived Predictions for New Query Chemicals?
Source: NIAID Data Ecosystem. Indexed: 2026-03-10
Download link:
https://figshare.com/articles/dataset/How_Precise_Are_Our_Quantitative_Structure_Activity_Relationship_Derived_Predictions_for_New_Query_Chemicals_/7106102
Resource description:
Quantitative structure–activity relationship (QSAR) models have long been used for making predictions and filling data gaps in diverse fields, including medicinal chemistry, predictive toxicology, environmental fate modeling, materials science, agricultural science, nanoscience, and food science. A QSAR model is usually developed from the chemical information of a properly designed training set and the corresponding experimental response data, and it is validated using one or more test sets for which experimental response data are available. However, it is also of interest to estimate the reliability of predictions when the model is applied to a completely new data set (a true external set), even when the new data points lie within the applicability domain (AD) of the developed model. In the present study, we have categorized the quality of predictions for the test set or true external set into three groups (good, moderate, and bad) based on absolute prediction errors. We have then used three criteria, namely (a) the mean absolute error of leave-one-out predictions for the 10 closest training compounds for each query molecule, (b) the AD in terms of similarity based on the standardization approach, and (c) the proximity of the predicted value of the query compound to the mean training response, in different weighting schemes to build a composite score of predictions. It was found that, using the most frequently appearing weighting scheme, 0.5–0–0.5, the composite score-based categorization showed concordance with the absolute prediction error-based categorization for more than 80% of the test data points when working with five different datasets, with 15 models for each set derived using three different splitting techniques. These observations were also confirmed with true external sets for another four endpoints, suggesting that the scheme can be used to judge the reliability of predictions for new datasets. The scheme has been implemented in a tool, “Prediction Reliability Indicator”, available at http://dtclab.webs.com/software-tools and http://teqip.jdvu.ac.in/QSAR_Tools/DTCLab/. The tool is presently valid for multiple linear regression models only.
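The weighted composite score described above can be illustrated with a minimal Python sketch. It is not the implementation used in the Prediction Reliability Indicator tool. The per-criterion scoring (3 = good, 2 = moderate, 1 = bad), the category cutoffs, and the function names are all assumptions made for illustration; only the 0.5–0–0.5 weighting and the idea of checking concordance against error-based categories come from the description.

```python
from typing import Sequence

# Assumption: each criterion (a), (b), (c) has already been mapped to a
# score of 3 (good), 2 (moderate), or 1 (bad). This mapping is hypothetical.
def composite_score(scores: Sequence[float],
                    weights: Sequence[float] = (0.5, 0.0, 0.5)) -> float:
    """Weighted sum of the three criterion scores (default scheme 0.5-0-0.5)."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * s for w, s in zip(weights, scores))

# Assumption: the cutoffs below are illustrative, not taken from the source.
def categorize(score: float,
               good_cutoff: float = 2.5,
               moderate_cutoff: float = 1.5) -> str:
    """Map a composite score to 'good', 'moderate', or 'bad'."""
    if score >= good_cutoff:
        return "good"
    if score >= moderate_cutoff:
        return "moderate"
    return "bad"

def concordance(score_based: Sequence[str], error_based: Sequence[str]) -> float:
    """Fraction of compounds whose two categorizations agree."""
    if len(score_based) != len(error_based) or not score_based:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(s == e for s, e in zip(score_based, error_based))
    return matches / len(score_based)

# Hypothetical criterion scores for four query compounds: (a, b, c).
queries = [(3, 1, 3), (3, 1, 2), (1, 3, 2), (1, 3, 1)]
score_categories = [categorize(composite_score(q)) for q in queries]
```

With the 0.5–0–0.5 scheme, criterion (b) has zero weight, so two compounds that differ only in their AD score would receive the same composite score in this sketch.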
Created:
2018-09-19



