Data from: ‘True’ null allele detection in microsatellite loci: a comparison of methods, assessment of difficulties, and survey of possible improvements

DataONE; updated 2014-09-04; indexed 2024-06-27

Download link:

https://search.dataone.org/view/null

Description:

Null alleles are alleles that for various reasons fail to amplify in a PCR assay. The presence of null alleles in microsatellite data is known to bias the genetic parameter estimates. Thus, efficient detection of null alleles is crucial, but the methods available for indirect null allele detection return inconsistent results. Here, our aim was to compare different methods for null allele detection, to explain their respective performance and to provide improvements. We applied several approaches to identify the ‘true’ null alleles based on the predictions made by five different methods, used either individually or in combination. First, we introduced simulated ‘true’ null alleles into 240 population data sets and applied the methods to measure their success in detecting the simulated null alleles. The single best-performing method was ML-NullFreq_frequency. Furthermore, we applied different noise reduction approaches to improve the results. For instance, by combining the results of several methods, we obtained more reliable results than using a single one. Rule-based classification was applied to identify population properties linked to the false discovery rate. Rules obtained from the classifier described which population genetic estimates and loci characteristics were linked to the success of each method. We have shown that by simulating ‘true’ null alleles into a population data set, we may define a null allele frequency threshold, related to a desired true or false discovery rate. Moreover, using such simulated data sets, the expected null allele homozygote frequency may be estimated independently of the equilibrium state of the population.
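As a rough illustration of what introducing a simulated 'true' null allele into a data set involves, the Python sketch below masks one chosen allele at a single locus: heterozygotes carrying it are scored as homozygotes for the other allele, and homozygotes for it are scored as missing. This is a minimal sketch under assumed conventions, not the authors' procedure. The function names, the tuple-based genotype encoding, and the use of `None` for a failed amplification are all hypothetical. Because the null homozygotes are counted directly, the resulting frequency does not rely on Hardy-Weinberg expectations.

```python
from collections import Counter

NULL = None  # hypothetical marker for a genotype that yields no PCR product


def introduce_null_allele(genotypes, null_allele):
    """Return genotypes as they would be scored if `null_allele` never amplified."""
    observed = []
    for a, b in genotypes:
        a_is_null = a == null_allele
        b_is_null = b == null_allele
        if a_is_null and b_is_null:
            observed.append(NULL)      # null homozygote: scored as missing
        elif a_is_null:
            observed.append((b, b))    # heterozygote scored as a homozygote
        elif b_is_null:
            observed.append((a, a))
        else:
            observed.append((a, b))    # genotype unaffected by the null allele
    return observed


def null_summary(genotypes, null_allele):
    """Return (true null allele frequency, observed null homozygote frequency)."""
    n = len(genotypes)
    allele_counts = Counter(allele for g in genotypes for allele in g)
    true_freq = allele_counts[null_allele] / (2 * n)
    observed = introduce_null_allele(genotypes, null_allele)
    null_hom_freq = sum(g is NULL for g in observed) / n
    return true_freq, null_hom_freq


# Toy locus with four diploid individuals; allele sizes are made up.
genotypes = [(100, 104), (104, 104), (100, 100), (100, 108)]
observed = introduce_null_allele(genotypes, null_allele=104)
true_freq, null_hom_freq = null_summary(genotypes, null_allele=104)
```

Since the simulated null allele is known, the predictions of any detection method run on `observed` can be compared against `true_freq` to count true and false discoveries.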

Created:

2014-09-04