
RF model predictions based on a negative dataset.

Figshare · Updated 2025-07-11 · Indexed 2026-04-28
Download link:
https://figshare.com/articles/dataset/RF_model_predictions_based_on_a_negative_dataset_/29548447
Description:
Crude cone snail venom is a rich source of bioactive compounds with significant therapeutic potential. In this study, we analyzed 5,985 cone snail peptides across 82 Conus species to identify unique cysteine (Cys) patterns and their associated frameworks. Classifying these Cys patterns by conserved framework combinations enabled the generation of species-level pattern barcodes, which were then used to assess how individual sequences correlate with species. By analyzing 151 known Conus peptide PDB files, we computed Cys disulfide linkages to assess overall stability profiles. Incorporating barcode data allowed us to filter the dataset and prepare it for machine learning (ML) processing. Random Forest (RF) modeling, a supervised learning technique, was used to predict the therapeutic potential of venom peptides. Feature extraction was based on known venom-derived, approved peptide drugs. The dataset was split 70:30 into training and test sets. A total of 6,430 peptides (5,985 from cone snails and 445 from other venomous species) were used to evaluate the model’s predictive capability. The proposed model achieved 90.48% accuracy in classifying peptide therapeutic potential. Model outputs then underwent structural and binding pattern analysis against known targets, which revealed significant similarities between the binding patterns of approved and novel peptides. Performance could be improved further by incorporating additional datasets and optimizing feature selection, which may broaden the model’s applicability to larger peptide datasets. Overall, this study underscores the potential of ML in advancing pharmacological research on diverse venom peptides.
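The description mentions Cys patterns and species-level barcodes without showing what they look like. The following is a minimal sketch, not the authors' pipeline: it assumes a Cys pattern is obtained by collapsing every run of non-Cys residues between the first and last cysteine into a single dash, and that a barcode is simply the set of patterns observed for a species. The function names `cys_pattern` and `species_barcodes` are hypothetical, and the two sequences are commonly cited conotoxin sequences used only for illustration.

```python
import re
from collections import defaultdict


def cys_pattern(sequence: str) -> str:
    """Collapse a peptide sequence into its cysteine pattern, e.g. 'CC-C-C'.

    Assumption: residues before the first and after the last Cys are ignored,
    and each run of non-Cys residues between cysteines becomes one dash.
    """
    seq = sequence.upper()
    first, last = seq.find("C"), seq.rfind("C")
    if first == -1:
        return ""  # no cysteines, so no pattern
    core = seq[first:last + 1]
    return re.sub(r"[^C]+", "-", core)


def species_barcodes(records):
    """Map each species to the sorted list of Cys patterns seen in its peptides.

    `records` is an iterable of (species, sequence) pairs (hypothetical input format).
    """
    barcodes = defaultdict(set)
    for species, seq in records:
        pattern = cys_pattern(seq)
        if pattern:
            barcodes[species].add(pattern)
    return {species: sorted(patterns) for species, patterns in barcodes.items()}


# Illustrative input only; not taken from the dataset described above.
example_records = [
    ("Conus geographus", "ECCNPACGRHYSC"),
    ("Conus magus", "CKGKGAKCSRLMYDCCTGSCRSGKC"),
]
example_barcodes = species_barcodes(example_records)
```

Under these assumptions, sequences that share a barcode entry have the same cysteine spacing topology, which is the kind of grouping a downstream filter or classifier feature could build on.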
Created:
2025-07-11