
Data from "Removing recombinant loci has minimal impact on species tree topologies estimated from empirical data"

DataCite Commons. Updated 2024-06-24. Indexed 2024-08-19.
Download link:
https://figshare.com/articles/dataset/Data_from_Removing_recombinant_loci_has_minimal_impact_on_species_tree_topologies_estimated_from_empirical_data_/26087437
Resource description:
Figshare repository documentation
Caitlin Cherryh, 2024

An underlying assumption in phylogenetics is that every site in a locus shares an identical evolutionary history that fits a single bifurcating tree. However, this assumption is violated by biological processes such as introgression and recombination. We selected four empirical datasets and investigated whether removing loci identified as putatively recombinant changed the species tree topology. To do so, we selected three tests for detecting recombination (PHI, MaxChi, and GeneConv) and applied each test to each locus in each dataset. We then used the results to divide the loci into subsets: for each test, one subset of loci that passed that test and one subset of loci that failed that test (i.e. loci identified as putatively recombinant). We estimated species trees from each subset with both summary coalescent (ASTRAL-III) and maximum likelihood (IQ-Tree2) tree estimation methods. Finally, we compared the goodness of fit and topology of each tree.

Replicating our analyses

The caitlinch/gene_filtering GitHub repository contains all R scripts necessary to repeat these analyses: https://github.com/caitlinch/gene_filtering
See the manuscript for detailed methods.

Software programs

Trees were estimated in IQ-Tree2 (http://www.iqtree.org/), ASTRAL (https://github.com/smirarab/ASTRAL), and RAxML-ng (https://github.com/amkozlov/raxml-ng).
The recombination tests applied are available in the programs PHIPack (https://www.maths.otago.ac.nz/~dbryant/software.html) and GeneConv (https://www.math.wustl.edu/~sawyer/geneconv/).
Tree adequacy tests were performed using the AU test (implemented in IQ-Tree2) and the QuartetNetwork Goodness of Fit test (https://github.com/cecileane/QuartetNetworkGoodnessFit.jl).
Further details on software programs are available in the manuscript or the GitHub repository for this project (https://github.com/caitlinch/gene_filtering).

Data

empirical_datasets.pdf: Documentation of the four empirical alignments analysed in this study, including the original manuscript and a record of where each matrix was obtained.

datasets/: One directory per dataset, containing the locus alignments used in our analysis.
    1KP/1KP_alignments-FAA-masked_genes_renamed.zip: Locus alignments used for this analysis.
    1KP/1KP_annotations.csv: CSV file from Leebens-Mack et al. (2019), outlining clades and classification for each taxon.
    Pease2016/Pease2016_all_window_alignments: Locus alignments used for this analysis. Generation of the window alignments is described in the methods of the manuscript.
    Vanderpool2020/Vanderpool2020_1730_Alignments_FINAL.zip: Locus alignments used for this analysis.
    Whelan2017_genes.zip: Locus alignments used for this analysis.

trees/: All maximum likelihood (estimated in IQ-Tree) and summary (estimated in ASTRAL) trees from our analysis.

qcf/: All quartet concordance factor results. One directory per dataset.

files/:
    00_1KP_loci_models_noFreeRates.csv: Model estimation for maximum likelihood trees from the 1KP dataset. Details in manuscript.
    01_AllDatasets_IQ-Tree_warnings_LociToExclude.csv: List of loci to exclude from tree estimation, based on errors raised in IQ-Tree.
    01_AllDatasets_RecombinationDetection_complete_collated_results.csv: Results from applying the recombination tests to each locus.
    02_AllDatasets_RecombinationDetection_PassFail_record.csv: Record of whether individual loci passed or failed each of the three tests for recombination.
    02_species_tree_summary_numbers.csv: Summary of the species tree estimation process. Lists the number of loci in each dataset that passed or failed each test for recombination.
    03_AllDatasets_collated_ComparisonTrees_AU_test_results.csv: AU test results for maximum likelihood trees.
    03_AllDatasets_collated_ComparisonTrees_QuarNetGoF_test_results.csv: Quartet Goodness of Fit test results for summary trees.
    03_AllDatasets_collated_RF_wRF_distances_results.csv: Robinson-Foulds (RF) and weighted Robinson-Foulds (wRF) distances between trees.
    04_BranchSupport_values.csv: Branch support values (ultrafast bootstrap or local posterior probability).
    04_qCF_values.csv: Quartet concordance factor results.
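
The pass/fail subsetting described above (for each recombination test, one subset of loci that passed and one that failed) can be illustrated with a minimal Python sketch. This is not the authors' code, which is written in R and lives in the GitHub repository. The column names (`dataset`, `locus`, `PHI`, `MaxChi`, `GeneConv`), the `pass`/`fail` labels, and the example rows are all hypothetical; the actual layout of 02_AllDatasets_RecombinationDetection_PassFail_record.csv may differ and should be checked before adapting this.

```python
import csv
import io
from collections import defaultdict

# Hypothetical pass/fail record. Column names, labels, and rows are
# illustrative assumptions, not the layout of the real Figshare file.
EXAMPLE_CSV = """dataset,locus,PHI,MaxChi,GeneConv
Example,locus_001,pass,pass,pass
Example,locus_002,fail,pass,pass
Example,locus_003,pass,fail,fail
Example,locus_004,fail,fail,pass
"""

TESTS = ("PHI", "MaxChi", "GeneConv")


def split_loci_by_test(csv_text, tests=TESTS):
    """Group loci into pass and fail subsets for each (dataset, test) pair.

    Returns a dict mapping (dataset, test) to {"pass": [...], "fail": [...]},
    with loci listed in the order they appear in the input.
    """
    subsets = defaultdict(lambda: {"pass": [], "fail": []})
    for row in csv.DictReader(io.StringIO(csv_text)):
        for test in tests:
            outcome = row[test].strip().lower()
            if outcome not in ("pass", "fail"):
                raise ValueError(
                    f"Unexpected outcome {outcome!r} for {row['locus']} ({test})"
                )
            subsets[(row["dataset"], test)][outcome].append(row["locus"])
    return dict(subsets)


if __name__ == "__main__":
    subsets = split_loci_by_test(EXAMPLE_CSV)
    for (dataset, test), groups in sorted(subsets.items()):
        print(dataset, test, "pass:", groups["pass"], "fail:", groups["fail"])
```

Each resulting subset corresponds to one set of loci from which a species tree would be estimated separately, so every dataset yields a pass tree and a fail tree per test for each estimation method.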
Provider:
figshare
Created:
2024-06-24