
Composite Dataset of Input and Output Files from Complex Similarity Network Analysis of Secreted Cysteine-Rich peptides/proteins Without Annotation (SCRs-WA)

Source: Mendeley Data; indexed 2026-04-18
Download link:
https://data.mendeley.com/datasets/mjnn6kjgkh
Description:
This dataset contains a composite collection of bioactive peptide sequences and Complex Similarity Network (CSN) analysis outputs, designed to explore the functional relationships of 1,872 Secreted Cysteine-Rich peptides/proteins Without Annotation (SCRs-WA). The dataset integrates eight peptide classes, including antimicrobial peptides (AMPs), defensins, venoms/toxins, and non-AMP controls, to establish a reference chemical space for functional inference. It includes both input sequence data (FASTA format) and CSN-derived output files, which facilitate the visualization and clustering of peptide sequences based on structural and functional similarities:

1- FileSM1: FileSM1_12449_All_8_datasets.fasta
📄 Content: A FASTA file containing 12,449 peptide sequences across eight datasets:
(i) Low-toxicity antimicrobial peptides (AMPs)
(ii) Defensins
(iii) Animal venoms and toxins
(iv) Cytotoxic peptides
(v) Haemolytic peptides
(vi) Non-AMPs (negative controls)
(vii) Cnidarian toxin candidates from S. savaglia
(viii) Secreted Cysteine-Rich ORFs Without Annotation (mSCRs-WA)
🔍 Usage:
- Serves as the primary input dataset for complex similarity network (CSN) analysis.
- Enables homology searches, functional annotation, and comparative analyses.

📤 Output Files from CSN Analysis

2- 🗂 FileSM2: FileSM2_HSPN_Topology_GraphML.zip
📄 Content: A compressed ZIP file containing GraphML representations of the Half-Space Proximal Network (HSPN):
HSPN_clusters_projection.graphml → Clustered projection of peptide connectivity based on similarity metrics.
HSPN_peptide_classes_projection.graphml → Projection of peptide classes (AMPs, toxins, defensins, etc.), highlighting their network positioning.
🖥 Visualization: Can be opened in Gephi v0.10 or any GraphML-compatible tool. Nodes represent peptide sequences, edges indicate functional similarity, and clusters reflect shared bioactivity profiles.
🔍 Usage:
- Facilitates visual exploration of sequence relationships.
- Enables functional annotation transfer by identifying clusters with known bioactive peptides.

3- 🗂 FileSM3: FileSM3_Clusters_Composition_Analysis.xlsx
📄 Content: A spreadsheet detailing cluster composition in the HSPN analysis, including:
- Cluster ID and size
- Distribution of peptides across eight datasets
- Functional annotation insights for each cluster
🔍 Usage:
- Helps identify key functional groups within the CSN framework.
- Provides quantitative insights into peptide distribution and classification.

4- 🗂 FileSM4: FileSM4_HSPN_Connections_Analysis.xlsx
📄 Content: A spreadsheet detailing functional connections between peptides, including:
- Pairwise similarity scores
- Network centrality measures (e.g., harmonic centrality, degree centrality)
- Annotations of linked sequences
🔍 Usage:
- Supports similarity-based functional inference.
- Helps track peptide relationships and connectivity patterns within the network.
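
As a starting point for working with the FASTA input described above, the following is a minimal sketch of reading FASTA records and computing the cysteine fraction of each sequence, using only the Python standard library. The helper names `read_fasta` and `cysteine_fraction` and the toy records are assumptions made for illustration; the header layout of the actual FileSM1 file is not documented here, so the sketch simply uses everything after `>` as the record name.

```python
from io import StringIO
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA lines into {header: sequence}.

    The header is the text after '>'. If two records share a header,
    the later one overwrites the earlier one.
    """
    records: Dict[str, str] = {}
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; join them in upper case.
            chunks.append(line.upper())
    if header is not None:
        records[header] = "".join(chunks)
    return records


def cysteine_fraction(sequence: str) -> float:
    """Fraction of residues that are cysteine (C); 0.0 for an empty sequence."""
    return sequence.count("C") / len(sequence) if sequence else 0.0


# Toy records invented for illustration; they are not taken from the dataset.
demo = StringIO(">toy_peptide_1\nACDCGC\nKC\n>toy_peptide_2\nGLFKKL\n")
records = read_fasta(demo)
for name, seq in records.items():
    print(name, len(seq), round(cysteine_fraction(seq), 2))
```

To apply the same helpers to the real input, an open file handle can be passed in place of the toy `StringIO` object, for example the handle returned by `open("FileSM1_12449_All_8_datasets.fasta")` once the file has been downloaded.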
Created:
2025-02-27