Semisupervised Learning to Boost hERG, Nav1.5, and Cav1.2 Cardiac Ion Channel Toxicity Prediction by Mining a Large Unlabeled Small Molecule Data Set
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://figshare.com/articles/dataset/Semisupervised_Learning_to_Boost_hERG_Nav1_5_and_Cav1_2_Cardiac_Ion_Channel_Toxicity_Prediction_by_Mining_a_Large_Unlabeled_Small_Molecule_Data_Set/26516034
Resource description:
Predicting drug toxicity is a critical aspect of ensuring patient
safety during the drug design process. Although conventional machine
learning techniques have shown some success in this field, the scarcity
of annotated toxicity data poses a significant challenge to improving
model performance. In this study, we explore the potential
of leveraging large unlabeled small molecule data sets using semisupervised
learning to improve drug cardiotoxicity predictive performance across
three cardiac ion channel targets: the voltage-gated potassium channel
(hERG), the voltage-gated sodium channel (Nav1.5), and the voltage-gated
calcium channel (Cav1.2). We extensively mined the ChEMBL database,
comprising approximately 2 million small molecules, and then employed
semisupervised learning to construct robust classification models
for this purpose. We achieved a performance boost on highly diverse
(i.e., structurally dissimilar) test data sets across all three targets.
Using the resulting models, we screened the entire ChEMBL database and
a large set of FDA-approved drugs, identifying several compounds with
potential cardiac ion channel activity. To ensure broad accessibility
and usability for both technical and nontechnical users, we developed
a cross-platform graphical user interface that allows users to make
predictions and gain insights into the cardiotoxicity of drugs and
other small molecules. The software is made available as open source
under the permissive MIT license at https://github.com/issararab/CToxPred2.
Created:
2024-08-07



