
Supplementary Material for: Genotyping Error Detection in Samples of Unrelated Individuals without Replicate Genotyping

Source: DataCite Commons. Updated 2025-05-01; indexed 2024-07-25.
Download link:
https://karger.figshare.com/articles/dataset/Supplementary_Material_for_Genotyping_Error_Detection_in_Samples_of_Unrelated_Individuals_without_Replicate_Genotyping/5120551/1
Description:
<i>Objective:</i> Identifying genotyping errors is an important issue in genetic research, yet it has been relatively less studied in samples consisting of unrelated individuals. In this article, we consider several models of genotyping errors, which were originally proposed for pedigree data, for unrelated population samples with single nucleotide polymorphism (SNP) genotype data. The mathematical constraints are investigated for detecting genotyping errors without resampling replicates or genotyping relatives. <i>Methods:</i> For the various proposed genotyping error models, we unveil the conditions under which the parameters are identifiable. These results are verified through applications to simulated and real SNP data. <i>Results:</i> We show that, with constraints, two particular models provide both identifiable error rate and allele frequencies of an SNP for unrelated population data. The simulation study shows that these two models present unbiased estimates for the allele frequencies. One of the models also gives an unbiased estimate for the genotyping error rate. <i>Conclusion:</i> While the Hardy-Weinberg equilibrium test can be used to detect genotyping errors, a key advantage of these models is the explicit estimates of genotyping error rates and allele frequencies. This work may help researchers to estimate error rates and to use the estimates in their analysis to increase power and decrease bias, without the extra work of genotyping family members or replicates.
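
The Hardy-Weinberg equilibrium test mentioned in the conclusion can be illustrated with a minimal sketch. This is the standard one-degree-of-freedom chi-square goodness-of-fit test for a biallelic SNP, not the error models studied in the article, and the function name `hwe_chi_square` and its arguments are hypothetical names chosen for illustration. It returns an allele frequency estimate and a p-value, but no genotyping error rate, which is the gap the abstract says the proposed models address.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium for one biallelic SNP.

    n_aa, n_ab, n_bb are observed genotype counts (hypothetical argument names).
    Returns (estimated frequency of allele A, chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")

    # Allele frequency from allele counting: each AA carries two A alleles.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or p == 1.0:
        raise ValueError("monomorphic SNP: the test is undefined")

    # Expected genotype counts under Hardy-Weinberg proportions.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, chi2, p_value
```

A small p-value signals a departure from Hardy-Weinberg proportions. Genotyping error is only one possible cause of such a departure, so the test by itself neither confirms an error nor quantifies an error rate.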

Provider:
Karger Publishers
Created:
2017-06-20