Half-Space Proximal Networks (HSPNs): A Proxy for Multi-Query Similarity Searching Models Predicting Tumor-Homing Peptides

Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://figshare.com/articles/dataset/Half-Space_Proximal_Networks_HSPNs_A_Proxy_for_Multi-Query_Similarity_Searching_Models_Predicting_Tumor-Homing_Peptides/30543557
Description:
Tumor-homing peptides (THPs) have emerged as promising agents in cancer treatments. These short sequences can specifically target tumor cells and vasculature. Here, a nontrained machine learning (ML) method based on network science and multiquery similarity searching to predict THPs is presented. We leverage the network-based representation of THPs’ chemical space to extract valuable information by employing a novel similarity-based, yet sparse, network known as the half-space proximal network (HSPN). The HSPN of the THPs’ giant component is composed of 12 communities that represent distinct modes of action and/or targets, as well as sequence templates (scaffolds). In the HSPN analysis, various centrality measures were employed to identify the most significant and nonredundant THPs. These central THPs were then used as queries (Qs) in group fusion similarity-based searches against an established collection of known THPs. The performance of the resulting multiquery similarity-based search models (MQSSMs) was assessed using three benchmarking datasets of THPs/non-THPs. The MQSSMs derived from the HSPNs (THP2) demonstrated superior discrimination performance compared to the classical chemical space networks (CSNs, namely THP1) when applied to the THPs/non-THPs datasets. Remarkably, exceptional MCC values (>0.887) were achieved in external datasets when Qs from both CSN and HSPN networks were used to construct MQSSMs (THP3) with a similarity threshold of 0.6. Next, we conducted a statistical comparison between the performance of our top-performing MQSSM, THP3, and several THP prediction servers, including TumorHPD, THPep, SCMTHP, and NEPTUNE. Our proposed model surpassed these state-of-the-art supervised and trained ML methods for THP prediction with statistically significant differences. These results provide strong evidence that network-based similarity searches are highly effective and reliable for identifying THPs.
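The multiquery search described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several points are assumptions made only for illustration: the similarity function is a generic stand-in built on `difflib.SequenceMatcher` (the description does not say which measure the authors use), group fusion is assumed to follow the MAX rule (a candidate is scored by its best match to any query), and the peptide strings are made up. Only the 0.6 threshold and the use of MCC come from the description.

```python
from difflib import SequenceMatcher
from math import sqrt


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]. ASSUMPTION: not the authors' measure."""
    return SequenceMatcher(None, a, b).ratio()


def max_fusion_score(candidate: str, queries: list[str]) -> float:
    """Group fusion, assuming the MAX rule: best similarity to any query."""
    return max(similarity(candidate, q) for q in queries)


def predict(candidates: list[str], queries: list[str],
            threshold: float = 0.6) -> list[bool]:
    """Label a candidate positive when its fused score reaches the threshold."""
    return [max_fusion_score(c, queries) >= threshold for c in candidates]


def mcc(y_true: list[bool], y_pred: list[bool]) -> float:
    """Matthews correlation coefficient computed from the confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0.0 when a row or column of the matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical query and candidate sequences, for illustration only.
    queries = ["ACDEFGHIK", "LMNPQRSTV"]
    candidates = ["ACDEFGHIK", "WWWWWWWWW"]
    labels = [True, False]
    predictions = predict(candidates, queries, threshold=0.6)
    print(predictions, mcc(labels, predictions))
```

In this sketch the query list plays the role of the central peptides selected from the network; how those queries are chosen (centrality measures on the HSPN) is outside its scope.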
Created:
2025-11-05