
Comparative Studies on Some Metrics for External Validation of QSPR Models

Source: NIAID Data Ecosystem (indexed 2026-03-07)
Download link:
https://figshare.com/articles/dataset/Comparative_Studies_on_Some_Metrics_for_External_Validation_of_QSPR_Models/2546854
Description:
Quantitative structure–property relationship (QSPR) models predict properties of untested chemicals and can therefore be used to prioritize the synthesis and experimental testing of new compounds. Validation plays a crucial role in judging how reliable such predictions are. The QSPR literature now gives serious attention to external validation as a check on model reliability, and in most cases predictive quality is judged from the predictions for a single test set, as reflected in one or more external validation metrics. Here, we show that a single QSPR model may exhibit a variable degree of prediction quality, depending on test set composition and test set size, as reflected in several external validation metrics: Q2F1, Q2F2, Q2F3, CCC, and rm2. All of these are differently modified forms of predicted variance, and each can theoretically attain a maximum value of 1. This report therefore questions the common "classic" practice of performing external validation on a single test set and then drawing a conclusion about the predictive quality of a model from one particular validation metric. The present work further demonstrates that, among the metrics considered, rm2 takes numerical values that differ statistically significantly from those of the others, among which CCC is the most optimistic, that is, the least stringent. Furthermore, at a given acceptance threshold for external validation metrics, rm2 provides the most stringent criterion of external validation (especially with Δrm2 at its highest tolerated value of 0.2), and it may be adopted in regulatory decision support processes.
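The description names the metrics without defining them. As a rough illustration of why their values shift with the test set, here is a minimal Python sketch using the formulas as they are commonly stated in the QSAR/QSPR literature. These definitions are an assumption and are not taken from this page, and the rm2 variant shown (regression through the origin, with observed values as the response) is only one of several formulations in use. All function names and data values are hypothetical.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def press(y_obs, y_pred):
    """Sum of squared prediction errors over the test set."""
    return sum((y - p) ** 2 for y, p in zip(y_obs, y_pred))


def q2_f1(y_obs, y_pred, y_train):
    """Test-set deviations are measured around the TRAINING-set mean."""
    m = mean(y_train)
    return 1 - press(y_obs, y_pred) / sum((y - m) ** 2 for y in y_obs)


def q2_f2(y_obs, y_pred):
    """Test-set deviations are measured around the TEST-set mean."""
    m = mean(y_obs)
    return 1 - press(y_obs, y_pred) / sum((y - m) ** 2 for y in y_obs)


def q2_f3(y_obs, y_pred, y_train):
    """Mean squared test error relative to the training-set variance."""
    m = mean(y_train)
    ss_train = sum((y - m) ** 2 for y in y_train)
    return 1 - (press(y_obs, y_pred) / len(y_obs)) / (ss_train / len(y_train))


def _centered_sums(y_obs, y_pred):
    mo, mp = mean(y_obs), mean(y_pred)
    cov = sum((y - mo) * (p - mp) for y, p in zip(y_obs, y_pred))
    ss_o = sum((y - mo) ** 2 for y in y_obs)
    ss_p = sum((p - mp) ** 2 for p in y_pred)
    return mo, mp, cov, ss_o, ss_p


def ccc(y_obs, y_pred):
    """Concordance correlation coefficient (Lin's formulation)."""
    mo, mp, cov, ss_o, ss_p = _centered_sums(y_obs, y_pred)
    return 2 * cov / (ss_o + ss_p + len(y_obs) * (mo - mp) ** 2)


def rm2(y_obs, y_pred):
    """r^2 * (1 - sqrt(|r^2 - r0^2|)); r0^2 from a fit through the origin.

    Assumed variant: observed values regressed on predicted values.
    """
    _, _, cov, ss_o, ss_p = _centered_sums(y_obs, y_pred)
    r2 = cov * cov / (ss_o * ss_p)
    k = sum(y * p for y, p in zip(y_obs, y_pred)) / sum(p * p for p in y_pred)
    r0_2 = 1 - sum((y - k * p) ** 2 for y, p in zip(y_obs, y_pred)) / ss_o
    return r2 * (1 - sqrt(abs(r2 - r0_2)))


# Hypothetical data: one model, two different test sets.
y_train = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
test_sets = {
    "wide range":   ([1.5, 3.5, 5.5], [1.8, 3.2, 5.9]),
    "narrow range": ([3.0, 3.5, 4.0], [3.3, 3.2, 4.4]),
}
for name, (obs, pred) in test_sets.items():
    print(name,
          round(q2_f1(obs, pred, y_train), 3),
          round(q2_f2(obs, pred), 3),
          round(q2_f3(obs, pred, y_train), 3),
          round(ccc(obs, pred), 3),
          round(rm2(obs, pred), 3))
```

In this sketch the two test sets carry identical absolute prediction errors, yet the narrow-range set scores much lower on the metrics whose denominators depend on the spread of the test data. That is the kind of test-set dependence the description refers to, though the made-up numbers say nothing about the study's actual results.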

Created:
2012-02-27