Data from: Distances and their visualization in studies of spatial-temporal genetic variation using single nucleotide polymorphisms (SNPs)
Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2025-06-15.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.4b8gthtkn
Description:
Distance measures are widely used for examining genetic structure in
datasets that comprise many individuals scored for a very large number of
attributes. Genotype datasets composed of single nucleotide polymorphisms
(SNPs) typically contain bi-allelic scores for tens of thousands if not
hundreds of thousands of loci. We examine the application of distance
measures to SNP genotypes and sequence tag presence-absences (SilicoDArT)
and use real datasets and simulated data to illustrate pitfalls in the
application of genetic distances and their visualization. The datasets
used to illustrate points in the associated review are provided here
together with the R script used to analyse the data. Data are either
simulated within this script or are SNP data generated as part of
other studies and included as compressed binary files that can be read
into R with the base R function readRDS(). Refer to the analysis
script for examples.
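As a rough illustration of the workflow the description mentions, the sketch below writes a small simulated genotype matrix to a compressed .rds file, reads it back with readRDS(), and computes a distance matrix between individuals. It is not taken from the dataset's analysis script. The matrix layout (individuals in rows, loci in columns, scores of 0, 1 or 2), the temporary file path, the choice of Euclidean distance, and all variable names are assumptions made for the example. The objects stored in the real files may have a different class, which class() will show after loading.

```r
# Simulated stand-in for a SNP genotype matrix (assumed layout):
# rows are individuals, columns are loci, scores are 0, 1 or 2.
set.seed(1)
n_ind <- 10
n_loc <- 200
geno <- matrix(sample(0:2, n_ind * n_loc, replace = TRUE),
               nrow = n_ind,
               dimnames = list(paste0("ind", seq_len(n_ind)), NULL))

# Round-trip through a compressed binary file. The path is a
# temporary, hypothetical one, not a file from the dataset.
rds_path <- tempfile(fileext = ".rds")
saveRDS(geno, rds_path)       # gzip-compressed by default
geno_in <- readRDS(rds_path)  # base R reader named in the description

# Pairwise Euclidean distances between individuals.
d <- dist(geno_in, method = "euclidean")

# Classical multidimensional scaling (principal coordinates) to get
# two axes that could be plotted.
ord <- cmdscale(d, k = 2)
head(ord)
```

To work with one of the provided files, replace rds_path with the path to the downloaded file and skip the simulation and saveRDS() steps. Calling plot(ord) gives a quick two-dimensional view of the ordination.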
Provider:
Dryad
Created:
2024-01-16



