Resolving population structure and genetic differentiation associated with RAD-SNP loci under selection in jute (Corchorus olitorius L.)
Source: DataCite Commons | Updated: 2025-04-01 | Indexed: 2024-07-27
Download link:
https://figshare.com/articles/Resolving_population_structure_and_genetic_differentiation_associated_with_RAD-SNP_loci_under_selection_in_non-model_jute_Corchorus_olitorius_L_/6339518/4
Description:
We performed individual-based restriction site-associated DNA (RAD) sequencing to discover single-nucleotide polymorphisms (SNPs) across a diverse set of 225 fibre-type lines of jute (Corchorus olitorius L., Malvaceae s. l.) and identified 1115 polymorphic RAD-SNP markers (SNP-containing RADseq loci), each supporting a single SNP with a minor allele frequency (MAF) of >0.05. Based on multilocus RAD-SNP genotypes of 221 lines (four lines were excluded because of >50% missing genotypes) with a call rate of >0.95, we examined the geographic patterns of genetic diversity across nine predefined populations, viz. AFR1 (Kenya and Sudan), AFR2 (Tanzania), CI (central India), EI (east India), NI (north India), SI (south India), NPPK (Nepal and Pakistan), ESEA (China, Myanmar, Indonesia and Thailand) and RoW (Australia, Brazil, Germany and Russia), and determined their genetic relatedness by assessing F_ST, AMOVA, N_m and principal component analysis (PCA) at the population level. Using five assignment tests with different statistical bases (k-means clustering, STRUCTURE, sNMF, DAPC and a frequency-based assignment test), the most exhaustive set yet to our knowledge, we inferred how these geographic populations are structured. We further applied subpopulation-based F_ST- (LOSITAN and BayeScan) and G_ST-outlier (HacDivSel) tests and an individual-based global approach (PCAdapt) based on PCA to detect putative RAD-SNP loci under selection. Instead of BLAST alone, we employed a serial approach based on BLAST, Blast2GO mapping, protein domain annotation (DoMosaics) and REViGO semantic analysis to identify candidate genes and retrieve the overrepresented gene ontology (GO) terms associated with the outlier RAD-SNP loci putatively involved in local adaptation.
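The filtering thresholds quoted in the description (>50% missing genotypes per line, MAF >0.05, call rate >0.95) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `filter_genotypes`, the 0/1/2 genotype coding, the order of the steps and the reading of call rate as a per-locus quantity are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical coding: alternate-allele count (0, 1 or 2); None = missing call.
Genotype = Optional[int]


def filter_genotypes(
    genotypes: Dict[str, List[Genotype]],
    max_line_missing: float = 0.50,  # lines with more missing calls are dropped
    min_maf: float = 0.05,           # loci must have MAF strictly above this
    min_call_rate: float = 0.95,     # assumed to apply per locus
) -> Tuple[List[str], List[int]]:
    """Return (names of retained lines, indices of retained loci)."""
    # Step 1: drop lines whose fraction of missing genotypes exceeds the cutoff.
    kept_lines = [
        name
        for name, calls in genotypes.items()
        if calls and calls.count(None) / len(calls) <= max_line_missing
    ]
    if not kept_lines:
        return [], []

    n_loci = len(genotypes[kept_lines[0]])
    kept_loci = []
    for j in range(n_loci):
        # Non-missing calls at locus j among the retained lines.
        called = [
            genotypes[name][j]
            for name in kept_lines
            if genotypes[name][j] is not None
        ]
        call_rate = len(called) / len(kept_lines)
        if not called or call_rate <= min_call_rate:
            continue
        # Diploid allele frequency: alternate alleles over total alleles called.
        alt_freq = sum(called) / (2 * len(called))
        maf = min(alt_freq, 1.0 - alt_freq)
        if maf > min_maf:
            kept_loci.append(j)
    return kept_lines, kept_loci
```

Calling `filter_genotypes` on a dictionary that maps line names to per-locus genotype lists returns the retained line names and locus indices. The actual study may have applied these criteria in a different order or with dedicated RADseq software.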
Provider:
figshare
Created:
2018-06-01



