
Counted Fingerprint-Enhanced Graph Neural Network Models Enable Accurate Screening of hERG Blockers from Diverse Categories of Chemicals

Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://figshare.com/articles/dataset/Counted_Fingerprint-Enhanced_Graph_Neural_Network_Models_Enable_Accurate_Screening_of_hERG_Blockers_from_Diverse_Categories_of_Chemicals/31568976
Resource description:
Chemical-induced blockade of the human ether-a-go-go-related gene (hERG) K+ channels may lead to fatal cardiac arrhythmia. Given the ever-increasing number of chemicals, developing in silico models is preferable to time-consuming experimental tests for efficiently screening potential hERG blockers. Nonetheless, the existing models were primarily constructed with structurally non-diverse data sets and adopted a single type of molecular representation, limiting their prediction accuracy and applicability domain (AD) coverage. Herein, dual feature-based neural network (DFNN) models with fused molecular fingerprint (MF) and molecular graph (MG) features were constructed based on an enlarged hERG blockade data set for high-throughput screening of hERG blockers. The optimal DFNN model achieved average area under the receiver operating characteristic curve, sensitivity, and specificity values of 0.950, 0.844, and 0.909, respectively, outperforming single MF- or MG-based neural network models, MF-based machine learning models, and previous models. ADs of the optimal model were characterized by an advanced structure–activity landscape analysis method. The model with defined ADs was applied to screen over 500,000 industrial chemicals, drugs, and natural products, yielding over 26,000 hERG blocker identifications. The state-of-the-art performance and robustness of the DFNN model underscore the effectiveness of the feature fusion strategy in the modeling process, and the strategy may hold significance for other end points.
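The three metrics reported in the description (area under the ROC curve, sensitivity, and specificity) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores below are all assumptions made for illustration, with 1 standing for "hERG blocker" and 0 for "non-blocker".

```python
# Illustrative only: toy data and helper names are hypothetical,
# not taken from the dataset or the paper.

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = blocker)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # fraction of true blockers that are flagged
    specificity = tn / (tn + fp)  # fraction of non-blockers that are cleared
    return sensitivity, specificity


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random
    negative; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical model outputs for eight chemicals.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"AUC={auc:.4f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

The AUC is independent of the threshold, whereas sensitivity and specificity shift with it, which is one reason a screening study would report all three together.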

Created: 2026-03-09