Supplementary data from: Integrating secondary structure information enhances phylogenetic signal in mitochondrial protein coding genes
Source: DataCite Commons. Updated: 2026-03-19. Indexed: 2026-04-25.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.wh70rxx29
Description:
Accurate phylogenetic inference requires models that account for
heterogeneity in molecular evolution. Mitochondrial protein-coding genes,
which encode membrane-bound proteins composed of multiple transmembrane
α-helices, exhibit considerable compositional and functional variation
across structural regions, variation that is often overlooked in standard
partitioning strategies. Here, we introduce TRAMPO (TRAnsMembrane Protein
Order), a novel pipeline that incorporates predicted secondary structural
features (i.e., matrix-facing, transmembrane, and intermembrane-facing
domains) into phylogenetic partitioning schemes. We applied TRAMPO to
seven mitochondrial datasets, spanning crustaceans, hexapods, and
vertebrates, and evaluated eight partitioning strategies based on
combinations of codon position, strand, and secondary structure.
Transmembrane helices showed pronounced thymine enrichment at second codon
positions and hydrophobic amino-acid composition, reflecting
domain-specific evolutionary constraints. To assess whether these
structural patterns influence phylogenetic reconstruction, we performed
maximum likelihood analyses under Markov models with various degrees of
complexity (ranging from standard Markov models, via Lie Markov and
General Heterogeneous evolution on a Single Topology Markov models, to
profile mixture Markov models). We also evaluated different models of
rate-heterogeneity across sites (including the invariable sites model,
gamma-distribution model, and FreeRate model) to examine their interaction
with partitioning strategies and overall model performance. Incorporating
structural information into partitioning schemes consistently improved
model fit and reduced apparent heterogeneity, as reflected in lower AIC
values and more compositionally homogeneous partitions. These improvements
translated into more consistent and topologically congruent phylogenetic
trees across most datasets, while also reducing computational time.
Notably, second codon positions in DNA that encode transmembrane helices
were consistently retained as distinct partitions during model
optimization, even in Mammals and Vertebrates, where secondary structure
contributed little to overall model performance, underscoring their strong
and conserved evolutionary signal. Surveys of tree space using quartet
distances further supported these findings, with structurally informed
models yielding more tightly clustered and internally consistent tree
topologies. The benefits of structural partitioning were most pronounced
in lineages of intermediate evolutionary depth and declined in ancient
vertebrate and mammalian clades, where substitutional saturation
accumulates with evolutionary time and strand asymmetry tends to emerge
more frequently. In some cases, models with the lowest AIC did not yield
the most congruent topologies, underscoring the limitations of information
criteria when comparing models of different complexity. Overall, our
findings demonstrate that secondary structural features, particularly the
repetitive architecture of transmembrane helices, harbour meaningful
phylogenetic signal. Incorporating this information into partitioning
schemes improves tree reconstruction and mitigates underlying
heterogeneity. TRAMPO provides a scalable, open-source tool to implement
this approach in mitochondrial phylogenetics.
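
The partitioning idea described above can be illustrated with a minimal Python sketch. It is not TRAMPO's implementation: the function names, the single-letter domain codes (M, T, I), and the toy sequence and annotation are all hypothetical, and strand is left out for brevity. The sketch assigns each nucleotide site a partition label from its codon position and the predicted domain of its codon, then reports the thymine frequency per partition, the kind of summary behind the reported thymine enrichment at second codon positions of transmembrane helices.

```python
from collections import Counter, defaultdict

# Hypothetical one-letter codes for the three predicted domain classes.
DOMAINS = {"M": "matrix", "T": "transmembrane", "I": "intermembrane"}


def partition_labels(domain_per_codon):
    """Return one label per nucleotide site, e.g. 'transmembrane_pos2'.

    `domain_per_codon` holds one domain code per codon of an in-frame
    coding sequence, so every codon contributes three site labels.
    """
    labels = []
    for code in domain_per_codon:
        for pos in (1, 2, 3):
            labels.append(f"{DOMAINS[code]}_pos{pos}")
    return labels


def base_frequency(seq, labels, base="T"):
    """Frequency of `base` in each partition, skipping gaps and ambiguity codes."""
    counts = defaultdict(Counter)
    for nt, label in zip(seq.upper(), labels):
        if nt in "ACGT":
            counts[label][nt] += 1
    return {label: c[base] / sum(c.values()) for label, c in counts.items()}


# Toy data (made up for illustration): four codons and their domain codes.
seq = "ATGCTTTTAGCA"
domains = "MTTI"

labels = partition_labels(domains)
freqs = base_frequency(seq, labels)

for label in sorted(freqs):
    print(f"{label}\tT frequency = {freqs[label]:.2f}")
```

In a real analysis the labels would be grouped into character sets and written to a partition file for the tree-inference program, with strand added as a third factor where the scheme calls for it.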
Provider:
Dryad
Created:
2026-03-19



