Untrimmed and trimmed alignments (and selected alignments depicting N-terminal extensions) and phylogenetic trees in nexus and pdf format for plastid-associated and other proteins in eleftherids, Ichthyodinida and psammosids.
Source: DataCite Commons. Updated: 2025-04-01. Indexed: 2024-08-18.
Download link:
https://figshare.com/articles/dataset/Untrimmed_and_trimmed_alignments_and_selected_alignments_depicting_N-terminal_extensions_and_phylogenetic_trees_in_nexus_and_pdf_format_for_plastid-associated_and_other_proteins_in_eleftherids_Ichthyodinida_and_psammosids_/19351700/1
Description:
The dataset contains untrimmed and trimmed alignments and the corresponding phylogenetic trees (in nexus and pdf format) for plastid metabolic pathways (isoprenoid and FeS cluster biosynthesis; heme biosynthesis) and the lysine biosynthesis pathway enzyme DapL in eleftherids, Ichthyodinida and psammosids. For proteins with N-terminal extensions in eleftherids and/or Ichthyodinida, manually curated alignments showcasing putative plastid-targeting sequences are included. For selected heme proteins, curated alignments for the plastidial and cytosolic clades are included.
The file names include the name of the protein. For alignment files (*.fasta), the file name indicates whether the alignment is untrimmed or trimmed; for tree files (*.tre and *.tre.pdf), it contains the protein name and the name of the model used for tree reconstruction.
The untrimmed alignments were produced by filtering the sequences with PREQUAL (default options) and then aligning them with MAFFT G-INS-i using the VSM option (--unalignlevel 0.6). The alignments were then processed with Divvier (-mincol 4 and -divvygap) before trimming with trimAl (-gt 0.01).
Trees were calculated with IQ-TREE, using the -mset option to restrict ModelFinder's model selection to LG; branch support was assessed with 1000 ultrafast bootstrap replicates.
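The processing steps described above can be sketched as a list of command lines. This is a minimal illustration under stated assumptions, not the authors' actual scripts: the function name `build_pipeline_commands`, the `Step` container, and all intermediate file names are hypothetical. The PREQUAL output suffix (`.filtered`), the Divvier output suffix (`.divvy.fas`), the MAFFT flags `--globalpair --maxiterate 1000` for G-INS-i, and the IQ-TREE flag `-bb 1000` for ultrafast bootstrap are assumptions that may differ between tool versions. The sketch only builds the commands and does not run them.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass
class Step:
    """One external command; stdout_to is set when the tool writes to stdout."""
    argv: List[str]
    stdout_to: Optional[str] = None


def build_pipeline_commands(raw_fasta: str) -> List[Step]:
    """Build (but do not execute) the five commands for one protein."""
    raw = Path(raw_fasta)
    # All intermediate names below are hypothetical or assumed tool defaults.
    filtered = f"{raw}.filtered"                # assumed PREQUAL output name
    untrimmed = f"{raw.stem}.untrimmed.fasta"   # hypothetical naming
    divvied = f"{untrimmed}.divvy.fas"          # assumed Divvier output name
    trimmed = f"{raw.stem}.trimmed.fasta"       # hypothetical naming

    return [
        # 1. Filter sequences with PREQUAL, default options.
        Step(["prequal", str(raw)]),
        # 2. Align with MAFFT G-INS-i plus VSM; MAFFT prints to stdout.
        Step(
            ["mafft", "--globalpair", "--maxiterate", "1000",
             "--unalignlevel", "0.6", filtered],
            stdout_to=untrimmed,
        ),
        # 3. Process the alignment with Divvier.
        Step(["divvier", "-mincol", "4", "-divvygap", untrimmed]),
        # 4. Trim with trimAl using a gap threshold of 0.01.
        Step(["trimal", "-in", divvied, "-out", trimmed, "-gt", "0.01"]),
        # 5. Infer the tree with IQ-TREE, restricting ModelFinder to LG,
        #    with 1000 ultrafast bootstrap replicates (flag is an assumption).
        Step(["iqtree", "-s", trimmed, "-mset", "LG", "-bb", "1000"]),
    ]


if __name__ == "__main__":
    for step in build_pipeline_commands("DapL.fasta"):
        redirect = f" > {step.stdout_to}" if step.stdout_to else ""
        print(" ".join(step.argv) + redirect)
```

To execute a step, each `argv` list could be passed to `subprocess.run` with `check=True`, opening `stdout_to` as the output file where it is set. Checking the output names against the installed tool versions first is advisable.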
Provider:
figshare
Created:
2023-10-31



