Data associated with the article entitled Plasmepsin-like aspartyl proteases in Babesia

Mendeley Data. Updated: 2024-03-27. Indexed: 2024-06-26.
Download link:
https://data.mendeley.com/datasets/ds3f2j32ny
Description:
The dataset used for the phylogenetic analysis comprises 106 aspartyl protease (AP) protein sequences from representatives of the phylum Apicomplexa and the related genera Vitrella and Chromera. All sequences were retrieved from either GenBank or EupathDB using the BLAST programs blastp and tblastn with an E-value cutoff of 10^-5. The alignment was constructed in Geneious Prime 2020.1.2 using MAFFT v7.017 with the default parameters for the gap opening penalty (1.53) and the offset value (0.123). The protein sequences were cross-checked for the presence of the DTG/DTG or DTG/DSG aspartyl protease motifs. Poorly aligned N-termini (signal peptide included) and C-termini were manually trimmed, resulting in a final alignment of 324 amino acid positions.
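
The motif cross-check described above could be sketched roughly as follows. This is an illustrative heuristic only, not the authors' procedure: the function name `classify_ap_motifs` is hypothetical, and the sketch assumes that the first catalytic motif is always DTG and that the second motif is the next DTG or DSG found downstream of it.

```python
import re


def classify_ap_motifs(seq):
    """Classify a protein sequence by its aspartyl protease motif pair.

    Returns "DTG/DTG" or "DTG/DSG" when a DTG is followed downstream by
    a second DTG or DSG, and None otherwise. Assumption: a plain string
    search is enough; no structural or positional checks are made.
    """
    # Normalize case and drop alignment gap characters, if any.
    seq = seq.upper().replace("-", "")

    # Locate the first catalytic motif (assumed to be DTG).
    first = seq.find("DTG")
    if first == -1:
        return None

    # Look for the second motif (DTG or DSG) after the first one.
    second = re.search(r"D[TS]G", seq[first + 3:])
    if second is None:
        return None

    return f"DTG/{second.group(0)}"
```

Sequences for which the function returns None would need manual inspection, since a simple string search cannot tell a true catalytic motif from a chance occurrence of the same three residues.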

Created: 2024-01-23