Data from: Utility of pooled sequencing for association mapping in non-model organisms

Source: DataONE. Updated 2018-03-16. Indexed 2024-06-25.

Download link:

https://search.dataone.org/view/null

Description:

High-density genome-wide sequencing increases the likelihood of discovering genes of major effect and genomic structural variation in organisms. While reference genomes are increasingly available across broad taxa, the greatest limitation to whole-genome sequencing of multiple individuals continues to be the cost of sequencing. To reduce that cost, multiple individuals with similar phenotypes can be pooled and the homogenized DNA sequenced (Pool-Seq), which achieves high genome coverage at the loss of individual genotypes. Although Pool-Seq has been an effective method for association mapping in model organisms, it has not been frequently used in natural populations. To extend bioinformatic tools for rapid implementation of Pool-Seq data in non-model organisms, we developed a pipeline called PoolParty and illustrate its effectiveness in genetic association mapping. Alignment expectations based on five pooled Chinook salmon (Oncorhynchus tshawytscha) libraries showed that approximately 48% genome coverage per library could be achieved with reasonable sequencing effort. We additionally examined male and female O. tshawytscha libraries to illustrate how Pool-Seq techniques can successfully map known genes associated with functional differences between the sexes, such as growth hormone 2. Finally, we compared pools of individuals of different spawning ages for each sex to discover novel genes involved in age at maturity in O. tshawytscha, such as opsin4 and transmembrane protein 19. While not appropriate for every system, processing Pool-Seq data with the PoolParty pipeline is a practical method for identifying genes of major effect in non-model organisms when high genome coverage is necessary and cost is a limiting factor.

Created:

2018-03-16
