
DataSheet_1_A Detailed View of KIR Haplotype Structures and Gene Families as Provided by a New Motif-Based Multiple Sequence Alignment.zip

Indexed from NIAID Data Ecosystem on 2026-03-12
Download link:
https://figshare.com/articles/dataset/DataSheet_1_A_Detailed_View_of_KIR_Haplotype_Structures_and_Gene_Families_as_Provided_by_a_New_Motif-Based_Multiple_Sequence_Alignment_zip/14893011
Description:
Human chromosome 19q13.4 contains genes encoding killer-cell immunoglobulin-like receptors (KIR). Reported haplotype lengths range from 67 to 269 kb and contain 4 to 18 genes. The region has certain properties such as single nucleotide variation, structural variation, homology, and repetitive elements that make it hard to align accurately beyond single gene alleles. To the best of our knowledge, a multiple sequence alignment of KIR haplotypes has never been published or presented. Such an alignment would be useful to precisely define KIR haplotypes and loci, provide context for assigning alleles (especially fusion alleles) to genes, infer evolutionary history, impute alleles, interpret and predict co-expression, and generate markers. In order to extend the framework of KIR haplotype sequences in the human genome reference, 27 new sequences were generated including 24 haplotypes from 12 individuals of African American ancestry that were selected for genotypic diversity and novelty to the reference, to bring the total to 68 full length genomic KIR haplotype sequences. We leveraged these data and tools from our long-read KIR haplotype assembly algorithm to define and align KIR haplotypes at <5 kb resolution on average. We then used a standard alignment algorithm to refine that alignment down to single base resolution. This processing demonstrated that the high-level alignment recapitulates human-curated annotation of the human haplotypes as well as a chimpanzee haplotype. Further, assignments and alignments of gene alleles were consistent with their human curation in haplotype and allele databases. These results define KIR haplotypes as 14 loci containing 9 genes. The multiple sequence alignments have been applied in two software packages as probes to capture and annotate KIR haplotypes and as markers to genotype KIR from WGS.
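The coarse-then-fine idea in the description can be illustrated with a toy sketch. This is not the authors' algorithm or tooling: it assumes each haplotype has already been reduced to an ordered list of region labels, and it swaps in Python's standard `difflib.SequenceMatcher` for the coarse, label-level step only. The function name `align_labels` and the two gene orders are hypothetical illustrations, not data from the dataset, and the base-level refinement step is not shown.

```python
from difflib import SequenceMatcher

GAP = "-"


def align_labels(ref, hap):
    """Pairwise-align two ordered lists of region labels.

    Returns two equal-length rows in which labels present in only one
    input sit opposite a gap. This is a toy stand-in for a coarse,
    region-level alignment; it does not look at nucleotide sequence.
    """
    ref_row, hap_row = [], []
    matcher = SequenceMatcher(a=ref, b=hap, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ref_row += ref[i1:i2]
            hap_row += hap[j1:j2]
        else:
            # Unmatched labels get their own columns, opposite a gap.
            ref_row += ref[i1:i2] + [GAP] * (j2 - j1)
            hap_row += [GAP] * (i2 - i1) + hap[j1:j2]
    return ref_row, hap_row


if __name__ == "__main__":
    # Hypothetical label orders, for illustration only.
    longer = ["3DL3", "2DL3", "2DP1", "2DL1", "3DP1",
              "2DL4", "3DL1", "2DS4", "3DL2"]
    shorter = ["3DL3", "2DL3", "2DP1", "2DL1", "3DP1", "2DL4", "3DL2"]
    for row in align_labels(longer, shorter):
        print("\t".join(row))
```

In a real pipeline each aligned column would then be handed to a standard sequence aligner to reach single-base resolution, as the description outlines.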

Created:
2021-07-01