Mitochondrial DNA for phylogeny building: Assessing individual and grouped mtGenes as proxies for the mtGenome in Platyrrhines
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.q2bvq83w8
Resource description:
Phylogenetic trees are analytic tools used in primate studies to elucidate evolutionary relationships. Because it is relatively easy to sequence and evolves rapidly compared to nuclear genomes, mitochondrial DNA is frequently used for phylogeny building. This project evaluated the effectiveness of using individual or grouped mitochondrial genes (mtGenes) as a proxy for the mitochondrial genome (mtGenome) in phylogeny building within two nested primate datasets, Cebidae and Platyrrhini, with differing divergence dates. MtGene utility rankings were determined based on each tree's congruence with the mtGenome tree. MtGene trees were also assessed on tree resolution and ability to sort nested clades. We found that most individual mtGenes, including ribosomal genes (12S and 16S), COX genes, most ND genes, and D-Loop, are not appropriate for use as proxies for the mtGenome when building trees in either the Cebidae or Platyrrhini set. On average, grouped mtGenes outperformed individual mtGenes in both sets, and mtGene and grouped mtGene rankings varied between sets. Pairing CYB with COX3 or ND2 with CYB worked well in both the Cebidae set and the Platyrrhini set. We also found that nucleotide diversity is not a predictor of mtGene performance. Instead, the unique evolutionary history of each mtGene or mtGene system may influence mtGene performance.
Methods
This dataset includes four main categories of data: the nexus files, the Visual TreeCmp scores, the mtGenome trees for both Platyrrhini and Cebidae, and the sample info.
(1) The nexus files were generated by extracting genes from whole mtGenome data, some of which are newly published in this study and available on GenBank (PP454502-PP454561). Gene regions were either already annotated in the mtGenome by previous authors or were predicted using the prediction and annotation feature of Geneious Prime v. 2023.1.2 (Biomatters Ltd.), with a closely related annotated genome as the reference: Sapajus xanthosternos (Accession no. KC757410) for Sapajus samples, and Cebus albifrons (Accession no. AJ309866) for Cebus samples. Entire mtGenomes, rRNA genes (n = 2), and D-Loop were aligned using the Clustal Omega v.1.2.2 (Sievers et al., 2011) alignment feature. Protein-coding genes (n = 13) were aligned using the Muscle v.5.1 (Edgar, 2022) multiple alignment feature. All gene alignments were additionally checked by eye and re-aligned if necessary. Each file is labeled with the mtGene, followed by the set the alignment comes from (Platyrrhini or Cebidae).
(2) The Visual TreeCmp scores were generated by submitting trees built from each of the alignments to the Visual TreeCmp online tool (https://eti.pg.edu.pl/TreeCmp/). Each mtGene therefore has scores for all rooted metrics for both the Cebidae set and the Platyrrhini set. Additionally, some analyses include sets of metric data at different posterior-probability thresholds (PPT), where the PPT is the minimum posterior probability a clade needs to be retained; clades below it are collapsed into a soft polytomy.
(3) The mtGenome trees were the reference trees used to evaluate the scores of the individual mtGenes.
(4) Sample info is simply a collection of information about the samples that we used in each of the datasets.
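The PPT collapsing in (2) and the comparison against a reference tree in (3) can be illustrated with a minimal Python sketch. It is not the authors' pipeline and not Visual TreeCmp's implementation: the nested-tuple tree representation, the function names `collapse`, `clades`, and `clade_mismatches`, the placeholder taxa A to E, and the support values are all assumptions made for illustration. The mismatch count is a simple Robinson-Foulds-style tally of clades present in only one of two rooted trees; the metrics reported in the dataset may be defined or scaled differently.

```python
from typing import FrozenSet, List, Set, Tuple, Union

# Assumed representation: a leaf is a taxon name; an internal node is
# (posterior_probability, children).
Tree = Union[str, Tuple[float, List["Tree"]]]


def collapse(tree: Tree, ppt: float) -> Tree:
    """Collapse every non-root clade whose posterior probability is below ppt."""
    if isinstance(tree, str):
        return tree
    support, children = tree
    new_children: List[Tree] = []
    for child in children:
        child = collapse(child, ppt)
        if not isinstance(child, str) and child[0] < ppt:
            # Weakly supported clade: attach its children directly to the
            # parent, which turns the parent into a soft polytomy.
            new_children.extend(child[1])
        else:
            new_children.append(child)
    return (support, new_children)


def clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Return the leaf sets of all internal nodes below the root."""
    found: Set[FrozenSet[str]] = set()

    def leaves(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            return frozenset([node])
        leafset = frozenset().union(*(leaves(c) for c in node[1]))
        found.add(leafset)
        return leafset

    # The root clade contains every taxon, so it carries no information.
    found.discard(leaves(tree))
    return found


def clade_mismatches(reference: Tree, candidate: Tree) -> int:
    """Count clades that appear in exactly one of the two rooted trees."""
    return len(clades(reference) ^ clades(candidate))


# Hypothetical trees over placeholder taxa A-E (not taken from the dataset).
reference = (1.0, [(0.99, [(0.97, ["A", "B"]), "C"]), (0.98, ["D", "E"])])
gene_tree = (1.0, [(0.95, [(0.55, ["A", "C"]), "B"]), (0.98, ["D", "E"])])

print(clade_mismatches(reference, gene_tree))                 # uncollapsed
print(clade_mismatches(reference, collapse(gene_tree, 0.9)))  # PPT = 0.9
```

In this toy case, collapsing the weakly supported (A, C) clade removes a conflicting grouping from the gene tree, so the mismatch count against the reference drops. This is one reason scores can differ between PPT sets for the same mtGene.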
Created:
2025-03-13



