
Data from: Evolutionary innovation through fusion of sequences from across the tree of life

Source: DataCite Commons. Updated: 2026-05-05. Indexed: 2026-04-25.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.t1g1jwtdz
Description:
We hypothesized that fusion of genes acquired via horizontal gene transfer (HGT) with endogenous sequences in arthropod genomes might generate what we call “HGT-chimeras”: genes with regions of non-metazoan and metazoan descent in the same open reading frame. This dataset supports the study of these HGT-chimeras presented in our manuscript “Evolutionary innovation through fusion of sequences from across the tree of life”. It includes the input data and intermediate output files used in our HGT-chimera detection pipeline and in the downstream bioinformatic characterization of these genes. The repository contains FASTA files of protein sequences, clustering results, phylogenetic trees, and tabular summaries of inferred HGT-chimeras, along with downstream analyses of molecular evolution (dN/dS), phylogenetic origin, gene expression, and domain architecture. Files are organized to correspond with the steps of the associated GitHub pipeline, beginning with the input clustering data (mmseq_cluster_representatives_with_missing.fasta) and concluding with analyses of the representative HGT-chimeras highlighted in the manuscript’s figures. These data can be reused to validate our findings, extend analyses of the discovered HGT-chimeras, or adapt the included pipeline to other genomic datasets. No ethical or legal restrictions apply to the data, which are derived from genome assemblies and annotation data available on NCBI.
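For readers who want to inspect the FASTA files named in the description, the following is a minimal sketch of a FASTA reader using only the Python standard library. It is not part of the authors' pipeline. The function name `read_fasta` and the example records are hypothetical, and the sketch assumes a conventional FASTA layout in which each record starts with a `>` header line whose first whitespace-delimited token is the sequence identifier. The header format of the dataset's files is not stated in the description.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {record_id: sequence} mapping.

    Assumption: the record ID is the first whitespace-delimited token
    after '>' on each header line.
    """
    records: Dict[str, str] = {}
    current_id = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Store the record collected so far before starting a new one.
            if current_id is not None:
                records[current_id] = "".join(chunks)
            parts = line[1:].split()
            current_id = parts[0] if parts else ""
            chunks = []
        elif current_id is not None:
            # Sequence lines may be wrapped; collect and join them.
            chunks.append(line)
    if current_id is not None:
        records[current_id] = "".join(chunks)
    return records


# Hypothetical in-memory example; these records are not taken from the dataset.
example = [">protA illustrative header", "MKT", "LLV", ">protB", "MSS"]
records = read_fasta(example)
print(len(records), "records parsed")
```

Because `read_fasta` accepts any iterable of lines, an open file handle can be passed directly, for example a handle for a local copy of mmseq_cluster_representatives_with_missing.fasta, assuming that file follows the layout described above.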
Provider:
Dryad
Created:
2025-10-27