Datasets_S1_to_S15_PhyloToL
Figshare | Updated: 2025-01-17 | Indexed: 2026-04-08
Download link:
https://figshare.com/articles/dataset/Datasets_S1_to_S15_PhyloToL/26540599/1
Resource description:
<b>Datasets S1 to S15 included within the PhyloToL analysis (see paper for more details, including full legends).</b><br>
<b>Dataset S1: </b>A record of every taxon and the corresponding sequence data used in the study.<br>
<b>Dataset S2: </b>A summary of taxon code prefixes corresponding to “major” (first two characters) and “minor” (first five characters) clades, along with the number of species (out of 1000 total) in the study falling in each minor clade.<br>
<b>Dataset S3: </b>A summary of the number of species included in the study per “major” clade, and the number of whole-genome assemblies vs. whole-transcriptome assemblies used for each major clade.<br>
<b>Dataset S4: </b>The file that we input to the ‘contamination loop’ of PhyloToL part two that defines rules for removing putative contaminant sequences based on sister relationships.<br>
<b>Dataset S5: </b>The file that we input to the ‘contamination loop’ of PhyloToL part two that defines rules for removing putative contaminant sequences based on ‘subsister’ relationships, where sequence A’s subsister is defined as the sister of A’s parent node.<br>
<b>Dataset S6: </b>The rules for clade-based contamination removal of ciliate clades, primarily to mitigate contamination by parabasalids.<br>
<b>Dataset S7: </b>The rules for general clade-based contamination removal.<br>
<b>Dataset S8: </b>A rules file for an alternative round of clade-grabbing designed specifically to account for clades of photosynthetic taxa containing species not otherwise expected to appear monophyletically.<br>
<b>Dataset S9: </b>A description of all “utility” scripts supplied in the GitHub repository (https://github.com/Katzlab/PhyloToL-6).<br>
<b>Dataset S10: </b>Descriptive statistics of the OGs in the Hook Database, used as a reference for OG assignment in PhyloToL 6 part 1.<br>
<b>Dataset S11: </b>A summary of the GO terms identified for each OG using EggNOG. See methods.<br>
<b>Dataset S12: </b>A summary of the performance of a set of exemplar runs of PhyloToL part 1. See results.<br>
<b>Dataset S13: </b>A description of the taxa containing each of the 500 OGs used in this study at each stage of curation.<br>
<b>Dataset S14: </b>A description of the ‘missing data’ at each stage in the contamination removal process for each taxon.<br>
<b>Dataset S15: </b>A summary of all of the taxa included in the Hook Database, as seeded by data from OrthoMCL version 6.13.
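Two definitions in the description above lend themselves to a small illustration: the “major”/“minor” prefixes of a taxon code (Dataset S2) and the ‘subsister’ of a sequence (Dataset S5). The sketch below is not taken from PhyloToL. The `Node` class, the function names, the taxon code `Aa_bb_Cccc`, and the toy tree are all hypothetical, and the tree is assumed to be strictly bifurcating.

```python
from typing import List, Optional, Tuple


def clade_prefixes(taxon_code: str) -> Tuple[str, str]:
    """Return (major, minor): the first two and first five characters of a taxon code."""
    if len(taxon_code) < 5:
        raise ValueError("taxon code is too short to contain a minor-clade prefix")
    return taxon_code[:2], taxon_code[:5]


class Node:
    """Hypothetical minimal tree node; not a PhyloToL data structure."""

    def __init__(self, name: str, children: Optional[List["Node"]] = None):
        self.name = name
        self.parent: Optional["Node"] = None
        self.children: List["Node"] = list(children or [])
        for child in self.children:
            child.parent = self


def sister(node: Node) -> Optional[Node]:
    """Return the other child of node's parent, or None if there is no unique sister."""
    if node.parent is None:
        return None
    siblings = [c for c in node.parent.children if c is not node]
    # Assumes a bifurcating tree, where each non-root node has exactly one sister.
    return siblings[0] if len(siblings) == 1 else None


def subsister(node: Node) -> Optional[Node]:
    """Return the sister of node's parent, following the definition given for Dataset S5."""
    if node.parent is None:
        return None
    return sister(node.parent)


# Toy tree (hypothetical): ((A, S), B), C
a, s, b, c = Node("A"), Node("S"), Node("B"), Node("C")
p1 = Node("P1", [a, s])
p2 = Node("P2", [p1, b])
root = Node("root", [p2, c])

major, minor = clade_prefixes("Aa_bb_Cccc")  # illustrative code, not a real taxon
a_sister = sister(a)
a_subsister = subsister(a)
```

In this toy tree, A’s sister is S and A’s subsister is B, since B is the sister of A’s parent node P1. How the actual rules files in Datasets S4 and S5 encode these relationships is described in the paper, not here.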
Provided by:
Cote-L’Heureux, Auden; Gawron, Rebecca; Leleu, Marie; Katz, Laura; Ani, Godwin
Created:
2024-08-13



