Supplementary material for "Surface frustration re-patterning underlies the structural landscape and evolvability of fungal orphan candidate effectors"
Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://zenodo.org/record/7506580
Resource description:
Tables
Table S1. List of fungal genomes analyzed in this work, associated references and properties.
Table S2. List of all secreted proteins shorter than 300 amino acids from the 20 fungal genomes. The table includes SignalP 4.0 output, mature sequence, ESpritz % disorder, Pfam domains, AlphaFold top-prediction pLDDT and the associated pdb file in Dataset S1.
Table S3. Top hits to the PDB database for all OCE structures. 'network_node_name' corresponds to the protein identifier in the OCE structure similarity network provided in Dataset S3. 'Hidef_raw_community' corresponds to groups of structural OCE analogs identified by HiDEF community detection performed on the network provided in Dataset S3.
Table S4. List of the 62 major OCE folds with associated statistics. Columns I to AB provide the number of occurrences per species. Note that the actual number of members per species might be underestimated due to the stringent pipeline used for OCE identification (for instance, it excludes proteins larger than 300 amino acids or containing Pfam domains).
Table S5. Relative surface exposure, conformational flexibility and conservation data mapped on residues of members of the Alt-A1 and BoNT families. RMSD, root mean square deviation for all aligned atoms; Conservation, percentage conservation in multiple structure alignment.
Table S6. Assignment of NCBI accessions to MMseqs clusters and assignment of MMseqs clusters to HMM matching-based super-clusters.
Table S7. Co-mutation occurrences and associated p-values in two OCE clades from the Alt-A1 and KP6 families.
Table S8. Amino acid properties inferred from mutation scans and frustration analyses in Alt-A1 cluster yellow1 and KP6 cluster 43. 'Number of aa variants' corresponds to the number of different amino acids found at each position (a deletion counts as 1). 'Alanine scan ∆Z' and 'Deletion scan ∆Z' correspond to the difference between the Z-score for the native protein against itself and the Z-score for the native protein against the mutant at each position (either alanine replacement or 5-aa deletion). 'Destabilization factor' is the average of columns E and F. 'Stabilization factor' corresponds to the difference between the structural variation expected from the destabilization factor and the structural variation observed in multiple mutants. 'netEffect' is the difference between columns G and H. 'Max co-mutation %' is the highest frequency of co-mutation observed with other residues in natural variants, with 'Min co-mutation p-value (Bonferroni corrected)' the associated p-value.
Table S9. List of natural variants and in silico mutants from the Alt-A1 cluster 25 analyzed in this work, including protein sequence and structure comparison scores (comparison with the reconstructed clade ancestor n0).
Table S10. List of natural variants and in silico mutants from the KP6 cluster 43 analyzed in this work, including protein sequence and structure comparison scores (comparison with the reconstructed clade ancestor n0).
Table S11. Summary statistics for the phylogenetic trees of 15 OCE clades analyzed for structure and frustration evolution.
Table S12. Mapping of structural and frustration data onto phylogenetic trees for 15 OCE clades. The corresponding trees and protein structures are provided in Dataset S7.
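The derived columns described for Table S8 can be illustrated with a minimal Python sketch. It is not the authors' code. The function name and arguments are hypothetical, and it assumes that columns E and F hold the two ∆Z scans, that column G is the destabilization factor, that column H is the stabilization factor, and that 'netEffect' is computed as G minus H (the caption does not state the direction of the difference).

```python
from statistics import mean


def derive_s8_columns(alanine_dz, deletion_dz, stabilization):
    """Return (destabilization_factor, net_effect) for one residue.

    Assumptions (not confirmed by the source):
      - alanine_dz and deletion_dz are the values of columns E and F
      - stabilization is the value of column H
      - netEffect is taken as G - H
    """
    # 'Destabilization factor' is the average of the two scan columns.
    destabilization = mean([alanine_dz, deletion_dz])
    # 'netEffect' is the difference between columns G and H (direction assumed).
    net_effect = destabilization - stabilization
    return destabilization, net_effect


# Made-up values for a single position, for illustration only.
destab, net = derive_s8_columns(alanine_dz=2.0, deletion_dz=4.0, stabilization=1.0)
print(destab, net)
```

If the published table uses the opposite sign convention for 'netEffect', only the subtraction line needs to be swapped.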
Datasets
Dataset S1. AlphaFold rank1 models for 3 927 OCEs (.pdb format).
Dataset S2. Pairwise structure comparison for 3 911 OCEs. DALI matrix output containing pairwise Z-scores.
Dataset S3. Network file including 2 561 OCEs with 3 or more vertices of Z-score weight 5.2 or more, in .sif and .xgmml formats.
Dataset S4. Videos illustrating the mapping of relative surface exposure and structural variability in Alt-A1 and BoNT groups, amino acid conservation, co-selected mutation patches and residue net stabilization effects on the Alt-A1 clade 25 ancestor and the KP6 cluster 43 ancestor. Color scales are as in Figures 2 and 3, respectively (.mp4 format).
Dataset S5. Phylogenetic trees (.nwk), ancestral (.fasta) and modern variant (.faa) sequences, and AlphaFold best protein models (.pdb) for members of KP6 cluster 43 and Alt-A1 cluster 25. The archive includes 140 Alt-A1 protein structures and 128 KP6 protein structures.
Dataset S6. Best predicted structures for 917 natural variants and mutants of AA1_cl25 and 801 natural variants and mutants of KP6_cl43 (.pdb format).
Dataset S7. Phylogenetic trees (.nwk) and AlphaFold best protein models (.pdb) for 15 OCE clades. The file includes 2 598 protein structures distributed across clades AA1_s (139), AA1_t (135), AA1_y1 (140), AA1_y2 (90), AA1_y3 (128), BoNT_s (291), CIP_s (167), CIP_t (231), crystallin (233), GNK2 (189), KP6_cl3 (203), KP6_cl26 (111), KP6_cl43 (123), KP6_cl96 (231), KP6_cl242 (187).
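As an illustration of how a network like the one in Dataset S3 could be derived from the pairwise Z-scores in Dataset S2, the Python sketch below keeps pairs with a Z-score of 5.2 or more and then keeps proteins that have at least 3 such connections. This is one reading of the Dataset S3 description, not the authors' pipeline. The function names, the input dictionary layout, the single-pass degree filter and the "dali" relation label are all assumptions made for the example.

```python
from collections import defaultdict

Z_CUTOFF = 5.2   # minimum Z-score for an edge (from the Dataset S3 description)
MIN_EDGES = 3    # minimum number of qualifying connections per protein


def filter_network(pair_z, z_cutoff=Z_CUTOFF, min_edges=MIN_EDGES):
    """Return (kept_nodes, kept_edges) from a {(a, b): z} dictionary.

    Hypothetical helper: one pass over the thresholded edges, no
    iterative pruning after low-degree nodes are dropped.
    """
    # Keep only pairs between distinct proteins that pass the cutoff.
    strong = [(a, b, z) for (a, b), z in pair_z.items()
              if a != b and z >= z_cutoff]
    # Count qualifying edges per protein.
    degree = defaultdict(int)
    for a, b, _ in strong:
        degree[a] += 1
        degree[b] += 1
    kept_nodes = {node for node, d in degree.items() if d >= min_edges}
    kept_edges = [(a, b, z) for a, b, z in strong
                  if a in kept_nodes and b in kept_nodes]
    return kept_nodes, kept_edges


def to_sif(edges, relation="dali"):
    """Format edges as tab-separated SIF lines: source, relation, target."""
    return [f"{a}\t{relation}\t{b}" for a, b, _ in edges]


# Toy Z-scores with made-up protein names, for illustration only.
toy = {("A", "B"): 6.0, ("A", "C"): 7.0, ("A", "D"): 5.2,
       ("B", "C"): 8.0, ("B", "D"): 9.0, ("C", "D"): 3.0,
       ("E", "A"): 5.0}
nodes, edges = filter_network(toy)
sif_lines = to_sif(edges)
```

Here only A and B reach three qualifying connections, so the toy network reduces to a single A to B edge. A real DALI matrix may hold different Z-scores for the two directions of a pair, which this sketch does not reconcile.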
Text and Figures
Text S1. Contains supplementary methods, results and figures S1 to S13.
Created:
2023-01-08



