Table1_Extensive set of African ancestry-informative markers (AIMs) to study ancestry and population health.xlsx
Source: NIAID Data Ecosystem (indexed 2026-03-14)
Download link:
https://figshare.com/articles/dataset/Table1_Extensive_set_of_African_ancestry-informative_markers_AIMs_to_study_ancestry_and_population_health_xlsx/22149053
Description:
Introduction: Human populations are often highly structured due to differences in genetic ancestry among groups, posing difficulties in associating genes with diseases. Ancestry-informative markers (AIMs) aid in the detection of population stratification and provide an alternative approach to map population-specific alleles to disease. Here, we identify and characterize a novel set of African AIMs that separate populations of African ancestry from other global populations including those of European ancestry.
Methods: Using data from the 1000 Genomes Project, highly informative SNP markers from five African subpopulations were selected based on estimates of informativeness (In) and compared against the European population to generate a final set of 46,737 African ancestry-informative markers (AIMs). The AIMs were validated using an independent dataset and functionally annotated using tools such as SIFT and PolyPhen. They were also assessed for their representation on commonly used SNP arrays.
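The abstract does not spell out how informativeness (In) is computed. As a minimal sketch, assuming the commonly used informativeness-for-assignment statistic of Rosenberg et al. (2003), a per-SNP value can be obtained from allele frequencies in each population as shown below. The function names and the example frequencies are hypothetical, and the selection thresholds used by the authors are not stated here.

```python
import math


def informativeness(freqs_by_pop):
    """Informativeness for assignment (In) at one locus.

    freqs_by_pop: list of K lists; each inner list holds the allele
    frequencies of one population and should sum to 1.
    Assumed definition (Rosenberg et al. 2003), natural logarithms:
        In = sum_j ( -p_j * ln(p_j) + (1/K) * sum_i p_ij * ln(p_ij) )
    where p_j is the mean frequency of allele j across the K populations.
    """
    k = len(freqs_by_pop)
    n_alleles = len(freqs_by_pop[0])

    def xlogx(p):
        # Convention: 0 * ln(0) is treated as 0.
        return p * math.log(p) if p > 0 else 0.0

    total = 0.0
    for j in range(n_alleles):
        mean_p = sum(pop[j] for pop in freqs_by_pop) / k
        total += -xlogx(mean_p) + sum(xlogx(pop[j]) for pop in freqs_by_pop) / k
    return total


def biallelic_in(p_a, p_b):
    """In for a biallelic SNP, given one allele's frequency in two populations."""
    return informativeness([[p_a, 1 - p_a], [p_b, 1 - p_b]])


# Hypothetical frequencies: a strongly differentiated SNP scores higher
# than a weakly differentiated one.
print(biallelic_in(0.9, 0.1), biallelic_in(0.6, 0.4))
```

Under this definition, In is 0 when the populations share identical allele frequencies and reaches ln 2 for a biallelic SNP fixed for different alleles in two populations, so ranking SNPs by In favors markers that best separate the groups.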
Results: This set of African AIMs effectively separates populations of African ancestry from other global populations and further identifies substructure among populations of African ancestry. When a subset of these AIMs was studied in an independent dataset, they differentiated people who self-identify as African American or Black from those who identify their ancestry as primarily European. Most of the AIMs were found in intergenic and intronic regions, with only 0.6% in the coding regions of the genome. Most of the commonly used SNP arrays investigated contained less than 10% of the AIMs.
Discussion: While several functional annotations of both coding and non-coding African AIMs are supported by the literature and link these high-frequency African alleles to diseases in African populations, more effort is needed to map genes to diseases in these genetically diverse subpopulations. The relative dearth of these African AIMs on current genotyping platforms (the array with the highest fraction, Illumina’s Omni 5, harbors less than a quarter of the AIMs) further demonstrates the need to better represent historically understudied populations.
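The array-representation figures quoted above are fractions of the AIM set present on each platform. A minimal sketch of that calculation follows, assuming markers are matched by (chromosome, position) on the same genome build. The coordinates are made up for illustration, and the actual column layout of the spreadsheet and of any array manifest is not described here.

```python
def array_coverage(aims, array_sites):
    """Fraction of AIMs that also appear among an array's sites.

    Both arguments are iterables of (chromosome, position) tuples;
    matching by coordinate is an assumption of this sketch.
    """
    aim_set = set(aims)
    if not aim_set:
        return 0.0
    return len(aim_set & set(array_sites)) / len(aim_set)


# Hypothetical coordinates, for illustration only.
aims = [("1", 1000), ("1", 2000), ("2", 1500), ("3", 42)]
array_sites = [("1", 1000), ("4", 99), ("5", 7)]
print(f"{array_coverage(aims, array_sites):.0%} of AIMs are on the array")
```

Matching by rsID instead of coordinates would work the same way, provided both lists use the same identifier release.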
Created:
2023-02-23



