Data from: Evolutionary innovation through fusion of sequences from across the tree of life
Source: DataCite Commons. Updated: 2026-05-05. Indexed: 2026-04-25.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.t1g1jwtdz
Description:
We hypothesized that fusion of genes acquired via horizontal gene transfer
(HGT) with endogenous sequences in arthropod genomes might generate what
we call “HGT-chimeras”: genes with regions of non-metazoan and metazoan
descent in the same open reading frame. This dataset supports the study of
these HGT-chimeras presented in our manuscript “Evolutionary innovation
through fusion of sequences from across the tree of life”. It includes
input data and intermediate output files used in our HGT-chimera detection
pipeline, as well as in the downstream bioinformatic characterization of
these genes. The repository contains FASTA files of protein sequences,
clustering results, phylogenetic trees, and tabular summaries of inferred
HGT-chimeras, along with downstream analyses of molecular evolution
(dN/dS), phylogenetic origin, gene expression, and domain
architecture. Files are organized to correspond with steps in the
associated GitHub pipeline, beginning with input clustering data
(mmseq_cluster_representatives_with_missing.fasta) and concluding with
analyses of representative HGT-chimeras highlighted in the manuscript’s
figures. These data can be reused to validate our findings, extend
analyses of discovered HGT-chimeras, or adapt the included pipeline for
other genomic datasets. No ethical or legal restrictions apply to the
data, which are derived from available genome assemblies and annotation
data on NCBI.
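
As a starting point for reuse, the FASTA files described above can be read with a few lines of standard-library Python. The following is a minimal sketch, not part of the authors' pipeline: the function name `read_fasta` and the toy records are hypothetical, and it assumes ordinary FASTA formatting (a `>` header line followed by one or more sequence lines).

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {record_id: sequence} dictionary."""
    records: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            parts = line[1:].split()
            current_id = parts[0] if parts else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines may wrap; concatenate them under the current ID.
            records[current_id] += line
    return records


# Toy input (hypothetical records, not taken from the dataset).
example = """>seq1 hypothetical protein
MKTAYIAK
QRQISFVK
>seq2
MSLLT
""".splitlines()

records = read_fasta(example)
print(len(records), "sequences parsed")
```

To apply the same function to a downloaded copy of the dataset, pass an open file handle instead of the toy lines, for example `read_fasta(open("mmseq_cluster_representatives_with_missing.fasta"))`, assuming the file sits in the working directory.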
Provider:
Dryad
Created:
2025-10-27



