101 flagellate phylogenomics data
Source: Figshare | Updated: 2024-01-12 | Indexed: 2026-04-28
Download link:
https://figshare.com/articles/dataset/101_flagellate_phylogenomics_data/22148027
Description:
Phylogenomic dataset and newly generated transcriptomic data for the study of 7 ancyromonads, 14 apusomonads and Meteora sporadica CRO19MET.

Markers and supermatrices: phylogenomics_101_flagellates_97171aa.tar.gz
Raw transcripts and peptides used for phylogenomics: 22_transcriptomes_brut.tar.gz
Transcripts and peptides without cross-contamination due to batch extraction/sequencing: 22_transcriptomes_croco.tar.gz
Peptides without bacterial contamination and redundancy: 22_transcriptomes_eukpep.tar.gz
SRA in BioProject: PRJNA908224

Detailed explanation, read carefully before using these datasets:

The aim of this study was to generate enough conserved phylogenomic markers to resolve the species phylogeny of Apusomonadida and Ancyromonadida within the tree of eukaryotes (with the additional inclusion of the incertae sedis protist Meteora sporadica). To that end, the original sets of de novo assembled transcripts from SPAdes (folder 01_transcripts_brut) were translated to proteins with TransDecoder and clustered with CD-HIT at 1% identity (folder 02_peptides_brut), then used to fill the phylogenomic dataset using BLASTp. As explained in the main text, all 22 transcriptomes filled the dataset well (Table S1) and had a high percentage of BUSCO completeness (Table S2), including values higher than that of the reference apusomonad genome of Thecamonas trahens.

We do not encourage the use of these raw ("brut") datasets unless all further analyses can be carefully checked on a case-by-case basis. Hence, with the aim of providing good-quality data to the research community, we implemented a decontamination pipeline, discussed below.

From the original set of de novo assembled transcripts, CroCo detected most cross-contamination in the 1st sequencing batch (Table S3), which was also the one with more reads; > 10 million reads, compared to
Created:
2024-01-12
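
The archives named in the description are gzip-compressed tarballs, and the text refers to folders inside them such as 01_transcripts_brut and 02_peptides_brut. As a minimal sketch for checking what a downloaded archive contains before running any analysis, the Python snippet below counts the regular files under each top-level folder. The function name summarize_archive and the local file path are assumptions made for illustration; the actual internal layout of the archives is not confirmed by this record.

```python
import tarfile
from collections import Counter
from pathlib import Path


def summarize_archive(archive_path):
    """Count regular files under each first path component of a .tar.gz archive."""
    counts = Counter()
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            if member.isfile():
                # First path component, e.g. a folder such as "01_transcripts_brut"
                # (or the file name itself if the file sits at the archive root).
                parts = Path(member.name).parts
                top = parts[0] if parts else ""
                counts[top] += 1
    return dict(counts)


if __name__ == "__main__":
    # Hypothetical local path: assumes the archive was already downloaded by hand.
    archive = Path("22_transcriptomes_brut.tar.gz")
    if archive.exists():
        for folder, n_files in sorted(summarize_archive(archive).items()):
            print(f"{folder}\t{n_files}")
```

Reading only the member headers avoids extracting anything to disk, which is convenient for large transcriptome archives.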



