Supporting data for "GADMA: Genetic algorithm for inferring demographic history of multiple populations from allele frequency spectrum data"

Source: DataCite Commons. Updated: 2025-05-26. Indexed: 2025-04-15.
Download link:
http://gigadb.org/dataset/100690
Description:
The demographic history of any population is imprinted in the genomes of the individuals that make up the population. One of the most popular and convenient representations of genetic information is the allele frequency spectrum or AFS, the distribution of allele frequencies in populations. The joint allele frequency spectrum is commonly used to reconstruct the demographic history of multiple populations, and several methods based on diffusion approximation (e.g., ∂a∂i) and ordinary differential equations (e.g., moments) have been developed and applied for demographic inference. These methods provide an opportunity to simulate AFS under a variety of researcher-specified demographic models and to estimate the best model and associated parameters using likelihood-based local optimizations. However, there are no known algorithms to perform global searches of demographic models with a given AFS. Here, we introduce a new method that implements a global search using a genetic algorithm for the automatic and unsupervised inference of demographic history from joint allele frequency spectrum data. Our method is implemented in the software GADMA (Genetic Algorithm for Demographic Model Analysis, https://github.com/ctlab/GADMA). We demonstrate the performance of GADMA by applying it to sequence data from humans and non-model organisms and show that it is able to automatically infer a demographic model close to or even better than the one that was previously obtained manually. Moreover, GADMA is able to infer multiple demographic models at different local optima close to the global one, providing a larger set of possible scenarios to further explore demographic history.
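
As an illustration of the AFS definition in the description above, the following minimal sketch tallies a single-population spectrum from a toy genotype matrix. It is not taken from GADMA, ∂a∂i, or moments; the function name `allele_frequency_spectrum` and the input layout (one row per site, one 0/1 derived-allele indicator per sampled chromosome) are assumptions made for this example only.

```python
from collections import Counter


def allele_frequency_spectrum(genotypes):
    """Return the unfolded AFS as a list of length n + 1.

    `genotypes` is a hypothetical input format: a list of sites, each a list
    of 0/1 values marking whether a sampled chromosome carries the derived
    allele. Entry k of the result is the number of sites at which exactly k
    of the n sampled chromosomes carry the derived allele.
    """
    if not genotypes:
        return []
    n = len(genotypes[0])
    counts = Counter()
    for site in genotypes:
        if len(site) != n:
            raise ValueError("every site must cover the same number of chromosomes")
        counts[sum(site)] += 1  # derived-allele count at this site
    return [counts[k] for k in range(n + 1)]


# Toy data: 5 sites observed in 4 sampled chromosomes.
sites = [
    [0, 0, 0, 1],
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
afs = allele_frequency_spectrum(sites)
print(afs)
```

A joint spectrum for several populations extends the same idea to a multidimensional array indexed by the derived-allele count in each population.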

Provider:
GigaScience Database
Created:
2020-01-08