Protein data
Source: Mendeley Data (indexed 2026-04-18)
Download link:
https://data.mendeley.com/datasets/6rnzgxxrzt
Description:
This folder contains 10 directories, one for each taxon: Bacillales, Bacteroidales, Corynebacteriales, Enterobacterales, Escherichia, Hyphomicrobiales, Methanococcales, Pseudomonadales, Sulfolobales, Thermococcales.
Each directory contains 2 sub-directories:
1. Protein family data
• Each protein family is named according to the PDB ID and the RefSeq Prot ID (PDBID_ProtID) of the protein used as seed to assemble the protein family.
• For each protein family the following files are provided:
• PDBID_ProtID.pdb file corresponds to the structure from the PDB of the protein used as seed to build protein families,
• PDBID_ProtID.fst file contains the sequence of the protein used as seed and its homologues,
• PDBID_ProtID.mafft file corresponds to the multiple alignment of the protein family obtained with MAFFT,
• PDBID_ProtID.mafft.BMGE45.fst file corresponds to the multiple alignment trimmed with BMGE,
• PDBID_ProtID.mafft.BMGE45.fst.treefile file corresponds to the maximum likelihood tree of the family inferred with IQ-TREE.
2. AlphaFold2 predictions
• Each protein family is named according to the PDB ID and the RefSeq Prot ID (PDBID_ProtID) of the protein used as seed to assemble the protein family.
• Each protein family directory contains one sub-directory per sequence in the protein family. Each sub-directory is named after the position of its sequence, in order of appearance, in the PDBID_ProtID.fst file.
• Each sub-directory contains the ranked_0.pdb file corresponding to the best structure predicted by AlphaFold2.
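
The layout described above can be navigated with a short script. The following is a minimal Python sketch, not part of the dataset: it lists the files expected for one family and pairs each sequence in PDBID_ProtID.fst with its ranked_0.pdb prediction. It assumes that the .fst file is in FASTA format (headers starting with ">") and that sub-directory numbering starts at 1; neither is stated in the description, so `first_index` is left adjustable. The function names and the directory arguments are hypothetical, since the actual names of the two sub-directories on disk are not given here.

```python
from pathlib import Path

# File suffixes listed in the dataset description for each protein family.
FAMILY_SUFFIXES = (
    ".pdb",                        # seed structure from the PDB
    ".fst",                        # seed sequence and its homologues
    ".mafft",                      # MAFFT multiple alignment
    ".mafft.BMGE45.fst",           # alignment trimmed with BMGE
    ".mafft.BMGE45.fst.treefile",  # maximum likelihood tree from IQ-TREE
)


def family_files(family_dir, family_id):
    """Return the expected file paths for one family, keyed by suffix.

    family_dir is whichever directory holds the family's files
    (hypothetical argument; the on-disk name is not given in the description).
    """
    base = Path(family_dir)
    return {suffix: base / f"{family_id}{suffix}" for suffix in FAMILY_SUFFIXES}


def read_fasta_headers(lines):
    """Return sequence headers (without '>') in order of appearance.

    Assumes the .fst file is FASTA-formatted.
    """
    return [line[1:].strip() for line in lines if line.startswith(">")]


def prediction_paths(headers, predictions_dir, family_id, first_index=1):
    """Pair each header with the ranked_0.pdb path of its prediction.

    Sub-directories are named by sequence position; starting at 1 is an
    assumption, so pass first_index=0 if the dataset counts from zero.
    """
    base = Path(predictions_dir) / family_id
    return [
        (header, base / str(index) / "ranked_0.pdb")
        for index, header in enumerate(headers, start=first_index)
    ]
```

In practice, the headers would be read from the real file, for example `read_fasta_headers(open(path_to_fst))`, and each returned path checked with `Path.exists()` before use, which also reveals whether the numbering assumption holds.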
Created:
2024-04-05



