SepPCNET: Deeping Learning on a 3D Surface Electrostatic Potential Point Cloud for Enhanced Toxicity Classification and Its Application to Suspected Environmental Estrogens

Source: NIAID Data Ecosystem (indexed 2026-03-12)
Download link:
https://figshare.com/articles/dataset/SepPCNET_Deeping_Learning_on_a_3D_Surface_Electrostatic_Potential_Point_Cloud_for_Enhanced_Toxicity_Classification_and_Its_Application_to_Suspected_Environmental_Estrogens/14939182
Description:
Deep learning (DL) offers an unprecedented opportunity to revolutionize the landscape of toxicity prediction based on quantitative structure–activity relationship (QSAR) studies in the big data era. However, the structural description in the reported DL-QSAR models is still restricted to the two-dimensional level. Inspired by point clouds, a type of geometric data structure, a novel three-dimensional (3D) molecular surface point cloud with electrostatic potential (SepPC) was proposed to describe chemical structures. Each surface point of a chemical is assigned its 3D coordinate and molecular electrostatic potential. A novel DL architecture SepPCNET was then introduced to directly consume unordered SepPC data for toxicity classification. The SepPCNET model was trained on 1317 chemicals tested in a battery of 18 estrogen receptor-related assays of the ToxCast program. The obtained model recognized the active and inactive chemicals at accuracies of 82.8 and 88.9%, respectively, with a total accuracy of 88.3% on the internal test set and 92.5% on the external test set, which outperformed other up-to-date machine learning models and succeeded in recognizing the difference in the activity of isomers. Additional insights into the toxicity mechanism were also gained by visualizing critical points and extracting data-driven point features of active chemicals.
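The description says each surface point carries a 3D coordinate plus an electrostatic potential value, and that the network consumes the points without relying on their order. The listing does not describe SepPCNET's layers, so the following is only a minimal sketch of one common way to get order independence: apply the same function to every point, then pool with a symmetric operation (here, max). The point values, `point_features`, and `global_descriptor` are hypothetical and chosen for illustration; they are not taken from the dataset or the authors' code.

```python
import random

# Hypothetical toy SepPC: each surface point is (x, y, z, electrostatic_potential).
# The numbers are made up for illustration, not taken from the dataset.
Point = tuple[float, float, float, float]


def point_features(p: Point) -> list[float]:
    """Shared per-point transform (a hand-written stand-in for a learned layer)."""
    x, y, z, v = p
    return [x + y + z, v, v * (x * x + y * y + z * z)]


def global_descriptor(cloud: list[Point]) -> list[float]:
    """Max-pool each feature over all points.

    max() is symmetric in its arguments, so the result does not
    depend on the order in which the points are listed.
    """
    feats = [point_features(p) for p in cloud]
    return [max(column) for column in zip(*feats)]


cloud = [(0.0, 1.0, 2.0, -0.5), (1.0, 0.0, 0.0, 0.3), (2.0, 2.0, 1.0, 0.1)]

# Reorder the same points with a fixed seed and compare descriptors.
shuffled = cloud[:]
random.Random(0).shuffle(shuffled)
same = global_descriptor(cloud) == global_descriptor(shuffled)
```

Here `same` is `True` for any reordering of `cloud`, because every point goes through the identical per-point function before the symmetric pooling step. A trained model would replace the hand-written `point_features` with learned layers and feed the pooled descriptor to a classifier.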

Created:
2021-07-09