Comparative Studies on Some Metrics for External Validation of QSPR Models

Indexed from NIAID Data Ecosystem on 2026-03-07

Download link:

https://figshare.com/articles/dataset/Comparative_Studies_on_Some_Metrics_for_External_Validation_of_QSPR_Models/2546806

Description:

Quantitative structure–property relationship (QSPR) models, which predict the properties of untested chemicals, can be used to prioritize the synthesis and experimental testing of new compounds. Validation plays a crucial role in judging the reliability of such models' predictions. In the QSPR literature, serious attention is now given to external validation as a check on model reliability, and in most cases predictive quality is judged from the predictions for a single test set, as reflected in one or more external validation metrics. Here, we show that a single QSPR model may exhibit a variable degree of prediction quality, depending on test set composition and test set size, as reflected in external validation metrics such as Q2F1, Q2F2, Q2F3, CCC, and rm2 (all of which are differently modified forms of predicted variance and can theoretically attain a maximum value of 1). This report therefore questions the common “classic” practice of performing external validation on a single test set and drawing a conclusion about a model's predictive quality from one particular validation metric. The present work further demonstrates that, among the metrics considered, rm2 takes values that differ statistically significantly from the others, of which CCC is the most optimistic (least stringent). Furthermore, at a given acceptance threshold for external validation metrics, rm2 provides the most stringent criterion of external validation (especially with Δrm2 at its highest tolerated value of 0.2), and may be adopted in regulatory decision support processes.
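
To make the dependence on the test set concrete, the following is a minimal sketch of four of the metrics named in the description. The formulas are not given on this page; they are assumed here from the definitions commonly used in the QSAR/QSPR literature (Q2F1 and Q2F3 referenced to the training-set mean, Q2F2 to the test-set mean, and CCC as Lin's concordance correlation coefficient). The function name `external_validation_metrics` and its arguments are hypothetical, and rm2 is left out because it additionally requires a regression-through-origin r² whose convention is not stated here.

```python
from statistics import fmean


def external_validation_metrics(y_train, y_test, y_pred):
    """Return Q2F1, Q2F2, Q2F3 and CCC for one test set.

    Assumed definitions (not taken from this page):
      Q2F1 = 1 - PRESS / sum((y_test - mean(y_train))^2)
      Q2F2 = 1 - PRESS / sum((y_test - mean(y_test))^2)
      Q2F3 = 1 - (PRESS / n_test) / (sum((y_train - mean(y_train))^2) / n_train)
      CCC  = 2*S_xy / (S_xx + S_yy + n_test * (mean(y_test) - mean(y_pred))^2)
    """
    n_train, n_test = len(y_train), len(y_test)
    mean_train = fmean(y_train)
    mean_test = fmean(y_test)
    mean_pred = fmean(y_pred)

    # Predictive residual sum of squares on the test set.
    press = sum((y - p) ** 2 for y, p in zip(y_test, y_pred))

    # Sums of squares used as the reference variance in each metric.
    ss_test_vs_train_mean = sum((y - mean_train) ** 2 for y in y_test)
    ss_test_vs_test_mean = sum((y - mean_test) ** 2 for y in y_test)
    ss_train = sum((y - mean_train) ** 2 for y in y_train)
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)

    # Cross-product term for the concordance correlation coefficient.
    s_xy = sum((y - mean_test) * (p - mean_pred) for y, p in zip(y_test, y_pred))

    return {
        "Q2F1": 1 - press / ss_test_vs_train_mean,
        "Q2F2": 1 - press / ss_test_vs_test_mean,
        "Q2F3": 1 - (press / n_test) / (ss_train / n_train),
        "CCC": 2 * s_xy
        / (ss_test_vs_test_mean + ss_pred + n_test * (mean_test - mean_pred) ** 2),
    }


# Illustrative numbers only; they are not data from the study.
metrics = external_validation_metrics(
    y_train=[1.0, 2.0, 3.0, 4.0],
    y_test=[2.0, 3.0, 5.0],
    y_pred=[2.5, 3.0, 4.5],
)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these assumed definitions the same predictions score differently on each metric, because each one normalizes the prediction error by a different reference variance. Changing which compounds fall into `y_test` changes those reference terms too, which is the kind of test-set dependence the description refers to.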

Created:

2012-02-27
