
Most Ligand-Based Classification Benchmarks Reward Memorization Rather than Generalization

Source: Figshare; updated 2018-05-08; indexed 2026-04-29
Download link:
https://figshare.com/articles/dataset/Most_Ligand-Based_Classification_Benchmarks_Reward_Memorization_Rather_than_Generalization/6231674
Description:
Undetected overfitting can occur when there are significant redundancies between training and validation data. We describe AVE, a new measure of training–validation redundancy for ligand-based classification problems, that accounts for the similarity among inactive molecules as well as active ones. We investigated seven widely used benchmarks for virtual screening and classification, and we show that the amount of AVE bias strongly correlates with the performance of ligand-based predictive methods irrespective of the predicted property, chemical fingerprint, similarity measure, or previously applied unbiasing techniques. Therefore, it may be the case that the previously reported performance of most ligand-based methods can be explained by overfitting to benchmarks rather than good prospective accuracy.
Created:
2018-05-08