Data from: Dissecting signal and noise in diatom chloroplast protein encoding genes with phylogenetic information profiling
Source: DataCite Commons. Updated: 2025-05-01. Indexed: 2025-04-09.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.610md
Description:
Previous analyses of single diatom chloroplast protein-encoding genes
recovered results highly incongruent with both traditional phylogenies and
phylogenies derived from the nuclear-encoded small subunit (SSU) gene. Our
analysis here of six individual chloroplast genes (atpB, psaA, psaB, psbA,
psbC and rbcL) obtained similarly anomalous results. However, phylogenetic
noise in these genes did not appear to be correlated, and their
concatenation appeared to effectively sum their collective signal. We
empirically demonstrated the value of combining phylogenetic information
profiling, partitioned Bremer support and entropy analysis in examining
the utility of various partitions in phylogenetic analysis. Noise was low
in the 1st and 2nd codon positions, but so was signal. Conversely, high
noise levels in the 3rd codon position were accompanied by high signal.
Perhaps counterintuitively, simple exclusion experiments demonstrated this
was especially true at deeper nodes where the 3rd codon position
contributed most to a result congruent with morphology and SSU (and the
total evidence tree here). Correlated with our empirical findings,
probability of correct signal (derived from information profiling)
increased and the statistical significance of substitutional saturation
decreased as data were aggregated. In this regard, the aggregated 3rd
codon position performed as well as or better than more slowly evolving
sites. Simply put, direct methods of noise removal (elimination of
fast-evolving sites) disproportionately removed signal. Information
profiling and partitioned Bremer support suggest that addition of
chloroplast data will rapidly improve our understanding of the diatom
phylogeny, but conversely also illustrate that some parts of the diatom
tree are likely to remain recalcitrant to addition of molecular data. The
methods based on information profiling have been criticized for their
numerous assumptions and parameter estimates and the fact that they are
based on quartets of taxa. Our empirical results support theoretical
arguments that the simplifying assumptions made in these methods are
robust to “real-life” situations.
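
The codon-position partitioning described above can be illustrated with a minimal sketch. It assumes an in-frame nucleotide alignment held as plain strings, and it uses mean per-site Shannon entropy as a simple stand-in for variability; the entropy analysis actually used for this dataset is not specified here and may differ. All function names and the toy alignment are hypothetical and are not taken from the dataset.

```python
import math
from collections import Counter


def codon_partitions(alignment):
    """Split in-frame aligned sequences into 1st, 2nd and 3rd codon positions.

    Returns a dict mapping position (1, 2, 3) to a list of sub-sequences.
    Assumes every sequence starts at codon position 1 (an assumption).
    """
    return {pos + 1: [seq[pos::3] for seq in alignment] for pos in range(3)}


def column_entropy(column):
    """Shannon entropy in bits of one alignment column.

    Gaps and ambiguity codes are ignored; an empty column scores 0.
    """
    counts = Counter(c for c in column.upper() if c in "ACGT")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def mean_entropy(seqs):
    """Mean per-site entropy over all columns of a set of aligned sequences."""
    columns = ["".join(col) for col in zip(*seqs)]
    if not columns:
        return 0.0
    return sum(column_entropy(c) for c in columns) / len(columns)


def entropy_by_codon_position(alignment):
    """Mean per-site entropy for each codon-position partition."""
    parts = codon_partitions(alignment)
    return {pos: mean_entropy(seqs) for pos, seqs in parts.items()}


# Toy alignment invented for illustration only (three codons per sequence).
toy_alignment = [
    "ATGGCAAAA",
    "ATGGCCAAG",
    "ATGGCGAAA",
    "ATGGCTAAG",
]
result = entropy_by_codon_position(toy_alignment)
```

In this toy case only the 3rd codon position varies, so it carries all of the entropy. High variability alone does not distinguish signal from noise, which is why the abstract pairs entropy analysis with information profiling and partitioned Bremer support.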
Provider:
Dryad
Created:
2016-07-01



