Imputed WGS data of 626 birds
Source: Figshare. Updated: 2019-11-07. Indexed: 2026-04-29.
Download link:
https://figshare.com/articles/dataset/Imputed_WGS_data_of_626_birds/10264913
Description:
Genotype imputation was performed with a two-step approach: first from 55 K to 600 K chip density, and then to whole-genome sequence (WGS) data. Before imputation, pre-phasing was carried out in Beagle 4.1 with default parameters (Browning and Browning, 2016). In the first step, 450 birds with 600 K chip data served as the reference panel, and 194 birds were imputed from 55 K to 600 K using Beagle 4.0 with pedigree information. The 600 K data of these 194 birds and of the 450 reference birds were then merged with VCFtools. In the second step, all 644 birds with 600 K data were imputed to WGS using a combined reference panel in Beagle 4.1 with default parameters. The combined reference panel comprised 24 key individuals from the yellow-feather dwarf broiler population and 311 birds with WGS data from diverse chicken breeds. Further details can be found in our previous study (Ye et al., 2019). After imputation, quality control of the imputed WGS data was performed in PLINK (Purcell et al., 2007) with the following criteria: SNP call rate > 95%, individual call rate > 97%, minor allele frequency (MAF) > 0.5%, and Hardy-Weinberg equilibrium P-value > 1.0e-6. Individuals with Mendelian errors were also excluded. The remaining 626 individuals and 11,173,020 SNPs were used for further analysis.
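The quality-control criteria above can be sketched as PLINK threshold options. The description does not give the exact command the authors ran, so the following Python snippet is only an illustration under stated assumptions: the file prefixes `imputed_wgs` and `imputed_wgs_qc` and the helper names are hypothetical, the data are assumed to be in PLINK binary format, and species or chromosome-set options and the Mendelian-error exclusion are left out. PLINK expresses call-rate filters as maximum missing rates, so a 95% call rate becomes `--geno 0.05` and a 97% call rate becomes `--mind 0.03`. The handling of values exactly at a threshold may differ slightly from the strict ">" used in the text.

```python
def plink_qc_args(snp_call_rate=0.95, ind_call_rate=0.97, maf=0.005, hwe_p=1e-6):
    """Map the stated QC criteria onto PLINK threshold options.

    Call rates are converted to maximum missing rates, because
    --geno (per SNP) and --mind (per individual) take missingness.
    """
    return [
        "--geno", f"{1 - snp_call_rate:g}",   # SNP call rate > 95%
        "--mind", f"{1 - ind_call_rate:g}",   # individual call rate > 97%
        "--maf", f"{maf:g}",                  # MAF > 0.5%
        "--hwe", f"{hwe_p:g}",                # HWE P-value > 1.0e-6
    ]


def build_command(in_prefix, out_prefix):
    """Assemble a full command line; both prefixes are hypothetical names."""
    return ["plink", "--bfile", in_prefix, *plink_qc_args(),
            "--make-bed", "--out", out_prefix]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without PLINK installed.
    print(" ".join(build_command("imputed_wgs", "imputed_wgs_qc")))
```

Printing the command rather than executing it keeps the example independent of any local PLINK installation; the argument list could be passed to `subprocess.run` once the input files and species options are settled.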
Created:
2019-11-07



