
The primate Major Histocompatibility Complex: Sets of posterior trees from BEAST2 for the whole-class multi-gene alignments

Source: DataCite Commons. Updated 2026-01-29. Indexed 2026-04-25.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.37pvmcvz7
Description:
Gene families are groups of evolutionarily-related genes. One large gene family that has experienced rapid evolution lies within the Major Histocompatibility Complex (MHC), whose proteins serve critical roles in innate and adaptive immunity. Across the ~60 million year history of the primates, some MHC genes have turned over completely, some have changed function, some have converged in function, and others have remained essentially unchanged. Past work has typically focused on identifying MHC alleles within particular species or comparing gene content, but more work is needed to understand the overall evolution of the gene family across species. Thus, despite the immunologic importance of the MHC and its peculiar evolutionary history, we lack a complete picture of MHC evolution in the primates. We readdress this question using sequences from dozens of MHC genes and pseudogenes spanning the entire primate order, building a comprehensive set of gene and allele trees with modern methods.

This dataset contains 7 sets of posterior trees which are outputs of BEAST2 (one each for Class I exon 2, Class I exon 3, Class I exon 4, Class IIA exon 2, Class IIA exon 3, Class IIB exon 2, and Class IIB exon 3). Each file is a .zip archive containing one file in NEXUS format. Each NEXUS file lists the alleles/sequences involved in the tree and then lists the trees (one for each state in the chain) in Newick format. We also included 3 summary trees for the genes in the Class I alpha-block (one each for exon 2, exon 3, and exon 4). These trees are also in NEXUS format.

Overall, we find that the Class I gene subfamily is evolving much more quickly than the Class II gene subfamily, with the exception of the Class II MHC-DRB genes. We also pay special attention to the often-ignored pseudogenes, which we use to reconstruct different events in the evolution of the Class I region.

This dataset can be used to explore the relationships between MHC genes within and between species. It could also be connected to other information, such as MHC diversity in different species or haplotype frequencies. All of the sequences that went into this dataset are publicly available, so there are no additional ethical or legal considerations for its use.
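
The file layout described above (a .zip archive holding one NEXUS file, which lists the sequences and then one Newick tree per chain state) can be read with the Python standard library alone. The following is a minimal sketch, not the authors' tooling. It assumes the usual BEAST2 layout of a `Translate` table with one entry per line followed by `tree NAME = NEWICK;` lines. The function names, the sample text, and the `Hypo-*` labels are hypothetical and are not taken from the dataset.

```python
import re
import zipfile

# Matches lines such as: tree STATE_0 = ((1:0.1,2:0.1):0.2,3:0.3):0.0;
TREE_LINE = re.compile(r"^\s*tree\s+(\S+)\s*=\s*(.*;)\s*$", re.IGNORECASE)

# Hypothetical miniature file in the assumed layout (labels are made up).
SAMPLE = """#NEXUS

Begin taxa;
    Dimensions ntax=3;
    Taxlabels
        Hypo-A
        Hypo-B
        Hypo-C
        ;
End;
Begin trees;
    Translate
        1 Hypo-A,
        2 Hypo-B,
        3 Hypo-C
        ;
tree STATE_0 = ((1:0.1,2:0.1):0.2,3:0.3):0.0;
tree STATE_1000 = ((1:0.2,3:0.2):0.1,2:0.3):0.0;
End;
"""


def parse_nexus_trees(text):
    """Return (translate, trees) parsed from the text of a NEXUS tree file.

    translate maps the numeric keys used inside the Newick strings to
    sequence labels; trees is a list of (tree_name, newick_string) pairs.
    Assumes one Translate entry per line.
    """
    translate = {}
    trees = []
    in_translate = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith("translate"):
            in_translate = True
            stripped = stripped[len("translate"):].strip()
            if not stripped:
                continue
        if in_translate:
            ends_block = stripped.endswith(";")
            entry = stripped.rstrip(";").rstrip(",").strip()
            if entry:
                key, label = entry.split(None, 1)
                translate[key] = label.strip("'\"")
            if ends_block:
                in_translate = False
            continue
        match = TREE_LINE.match(line)
        if match:
            name, newick = match.groups()
            # Drop an optional rooting comment such as [&R] before the tree.
            newick = re.sub(r"^\[&[RU]\]\s*", "", newick)
            trees.append((name, newick))
    return translate, trees


def read_trees_from_zip(path_or_file):
    """Parse the first member of a .zip archive as a NEXUS tree file."""
    with zipfile.ZipFile(path_or_file) as archive:
        member = archive.namelist()[0]
        text = archive.read(member).decode("utf-8")
    return parse_nexus_trees(text)


translate, trees = parse_nexus_trees(SAMPLE)
print(len(translate), "sequences;", len(trees), "trees")
```

For a downloaded archive, `read_trees_from_zip("some_archive.zip")` (placeholder file name) would return the same pair. Posterior tree sets can be large, so a streaming reader or a dedicated phylogenetics library may be a better fit than loading the whole file into memory as this sketch does.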
Provider:
Dryad
Created:
2025-09-18