Composite Dataset of Input and Output Files from Complex Similarity Network Analysis of Secreted Cysteine-Rich peptides/proteins Without Annotation (SCRs-WA)
Source: Mendeley Data. Indexed: 2026-04-18
Download link:
https://data.mendeley.com/datasets/mjnn6kjgkh
Description:
This dataset contains a composite collection of bioactive peptide sequences and Complex Similarity Network (CSN) analysis outputs, designed to explore the functional relationships of 1,872 Secreted Cysteine-Rich peptides/proteins Without Annotation (SCRs-WA). The dataset integrates eight peptide classes, including antimicrobial peptides (AMPs), defensins, venoms/toxins, and non-AMP controls, to establish a reference chemical space for functional inference.
It includes both input sequence data (FASTA format) and CSN-derived output files, which facilitate the visualization and clustering of peptide sequences based on structural and functional similarities:
1- 🗂 FileSM1: FileSM1_12449_All_8_datasets.fasta
📄 Content:
A FASTA file containing 12,449 peptide sequences across eight datasets:
(i) Low-toxicity antimicrobial peptides (AMPs)
(ii) Defensins
(iii) Animal venoms and toxins
(iv) Cytotoxic peptides
(v) Haemolytic peptides
(vi) Non-AMPs (negative controls)
(vii) Cnidarian toxin candidates from S. savaglia
(viii) Secreted Cysteine-Rich ORFs Without Annotation (mSCRs-WA)
🔍 Usage:
- Serves as the primary input dataset for complex similarity network (CSN) analysis.
- Enables homology searches, functional annotation, and comparative analyses.
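As a quick sanity check on the input file, the sequences can be read with a few lines of Python. This is a minimal sketch using only the standard library. The helper names (`parse_fasta`, `cysteine_fraction`) and the toy sequences are illustrative, and nothing is assumed about the header layout beyond the standard `>` prefix.

```python
from io import StringIO


def parse_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines may wrap over several lines
    if header is not None:
        yield header, "".join(chunks)


def cysteine_fraction(sequence):
    """Fraction of residues that are cysteine (C); 0.0 for an empty sequence."""
    seq = sequence.upper()
    return seq.count("C") / len(seq) if seq else 0.0


# Toy records standing in for the real file (made-up sequences).
demo = StringIO(">pep1\nACCK\nCC\n>pep2\nGGGG\n")
records = list(parse_fasta(demo))

# For the real data, replace `demo` with:
#   open("FileSM1_12449_All_8_datasets.fasta")
```

Run against the real file, `len(records)` should match the 12,449 sequences stated in the description.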
📤 Output Files from CSN Analysis
2- 🗂 FileSM2: FileSM2_HSPN_Topology_GraphML.zip
📄 Content:
A compressed ZIP file containing GraphML representations of the Half-Space Proximal Network (HSPN):
HSPN_clusters_projection.graphml → Clustered projection of peptide connectivity based on similarity metrics.
HSPN_peptide_classes_projection.graphml → Projection of peptide classes (AMPs, toxins, defensins, etc.), highlighting their network positioning.
🖥 Visualization:
Both files can be opened in Gephi v0.10 or any other GraphML-compatible tool.
Nodes represent peptide sequences, edges indicate functional similarity, and clusters reflect shared bioactivity profiles.
🔍 Usage:
- Facilitates visual exploration of sequence relationships.
- Enables functional annotation transfer by identifying clusters with known bioactive peptides.
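Outside Gephi, the GraphML files can also be inspected programmatically. The sketch below uses Python's standard `xml.etree.ElementTree` module and the standard GraphML namespace. Because the node and edge attribute keys used in the real files are not documented here, it only lists nodes and edges and computes node degree. The function name `graphml_summary` and the tiny demo document are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Standard GraphML XML namespace.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def graphml_summary(xml_text):
    """Return (node_ids, edges, degree) for the first graph in a GraphML document."""
    root = ET.fromstring(xml_text)
    graph = root.find("g:graph", GRAPHML_NS)
    node_ids = [n.get("id") for n in graph.findall("g:node", GRAPHML_NS)]
    edges = [(e.get("source"), e.get("target"))
             for e in graph.findall("g:edge", GRAPHML_NS)]
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return node_ids, edges, degree


# Tiny hand-written GraphML document standing in for the real files.
demo = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="undirected">
    <node id="a"/><node id="b"/><node id="c"/>
    <edge source="a" target="b"/>
    <edge source="b" target="c"/>
  </graph>
</graphml>"""

nodes, edges, degree = graphml_summary(demo)
```

For the real data, unzip FileSM2 and pass the text of either `.graphml` file to `graphml_summary`.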
3- 🗂 FileSM3: FileSM3_Clusters_Composition_Analysis.xlsx
📄 Content:
A spreadsheet detailing cluster composition in the HSPN analysis, including:
Cluster ID and size
Distribution of peptides across eight datasets
Functional annotation insights for each cluster
🔍 Usage:
- Helps identify key functional groups within the CSN framework.
- Provides quantitative insights into peptide distribution and classification.
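The kind of tally this spreadsheet summarizes can be sketched in Python: count how many peptides from each dataset fall into each cluster, then report the dominant dataset per cluster. This is only an illustration. The cluster IDs, labels, and assignments below are invented placeholders, and the actual column names in FileSM3 are not given here.

```python
from collections import Counter, defaultdict


def cluster_composition(assignments):
    """Map cluster ID -> Counter of dataset labels, from (cluster_id, dataset) pairs."""
    composition = defaultdict(Counter)
    for cluster_id, dataset in assignments:
        composition[cluster_id][dataset] += 1
    return dict(composition)


def dominant_dataset(counts):
    """Return (label, share) for the most common dataset label in one cluster."""
    label, n = counts.most_common(1)[0]
    return label, n / sum(counts.values())


# Hypothetical assignments; labels only mimic the dataset names listed above.
demo = [
    (1, "Defensins"), (1, "Defensins"), (1, "SCRs-WA"),
    (2, "Non-AMPs"), (2, "SCRs-WA"), (2, "Non-AMPs"), (2, "Non-AMPs"),
]
composition = cluster_composition(demo)
```

A cluster in which unannotated sequences sit alongside a clear majority from one annotated dataset is the sort of case where annotation transfer might be considered.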
4- 🗂 FileSM4: FileSM4_HSPN_Connections_Analysis.xlsx
📄 Content:
A spreadsheet detailing functional connections between peptides, including:
Pairwise similarity scores
Network centrality measures (e.g., harmonic centrality, degree centrality)
Annotations of linked sequences
🔍 Usage:
- Supports similarity-based functional inference.
- Helps track peptide relationships and connectivity patterns within the network.
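For readers unfamiliar with the centrality measures named above, the following Python sketch computes them on a toy graph using their textbook definitions: harmonic centrality of a node is the sum of reciprocal shortest-path distances to every other reachable node, and degree centrality is the node's degree divided by the number of other nodes. Whether the values in FileSM4 are weighted or normalized in the same way is not stated here, so treat this as an assumption-laden illustration rather than the procedure used to build the spreadsheet.

```python
from collections import deque


def bfs_distances(adjacency, start):
    """Unweighted shortest-path distances from `start` to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def harmonic_centrality(adjacency, node):
    """Sum of 1/d(node, other); unreachable nodes contribute 0."""
    dist = bfs_distances(adjacency, node)
    return sum(1.0 / d for other, d in dist.items() if other != node)


def degree_centrality(adjacency, node):
    """Degree divided by the number of other nodes in the graph."""
    n = len(adjacency)
    return len(adjacency[node]) / (n - 1) if n > 1 else 0.0


# Toy undirected path a - b - c plus an isolated node d.
demo = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}, "d": set()}
```

On this toy graph the middle node `b` scores highest on both measures, while the isolated node `d` scores zero.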
Created:
2025-02-27



