Data from: Seven common mistakes in population genetics and how to avoid them

DataONE, updated 2015-05-14, indexed 2024-06-27

Download link:

https://search.dataone.org/view/null


Resource description:

Since the data resulting from modern genotyping tools are astoundingly complex, genotyping studies require great care in the sampling design, genotyping, data analysis and interpretation. Such care is necessary because, with datasets containing thousands of loci, small biases can easily become strongly significant patterns. Such biases may already arise in routine tasks that are part of almost every genotyping study. Here, I discuss seven common mistakes that are frequently encountered in the genotyping literature: (i) giving more attention to genotyping than to sampling; (ii) failing to perform or report experimental randomisation in the lab; (iii) equating geopolitical borders with biological borders; (iv) testing significance of clustering output; (v) misinterpreting Mantel’s r statistic; (vi) only interpreting a single value of k; (vii) forgetting that only a small portion of the genome will be associated with climate. For each of these issues, I give some suggestions on how to avoid the mistake. Overall, I argue that genotyping studies would benefit from establishing a more rigorous experimental design, involving proper sampling design, randomisation and better distinction of a priori hypotheses and exploratory analyses.


Created:

2015-05-14
