
Data from: Distances and their visualization in studies of spatial-temporal genetic variation using single nucleotide polymorphisms (SNPs)

DataCite Commons | Updated 2025-06-01 | Indexed 2025-06-15
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.4b8gthtkn
Description:
Distance measures are widely used to examine genetic structure in datasets comprising many individuals scored for a very large number of attributes. Genotype datasets composed of single nucleotide polymorphisms (SNPs) typically contain bi-allelic scores for tens of thousands, if not hundreds of thousands, of loci. We examine the application of distance measures to SNP genotypes and sequence tag presence-absence data (SilicoDArT), and use real and simulated datasets to illustrate pitfalls in applying genetic distances and visualizing them. The datasets used to illustrate points in the associated review are provided here, together with the R script used to analyse the data. Data are either simulated within this script or are SNP data generated as part of other studies; the latter are included as compressed binary files that can be read into R with the base R function readRDS(). Refer to the analysis script for examples.
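
As a rough illustration of the workflow described above, the following minimal sketch writes a small genotype matrix to a compressed binary file, reads it back with readRDS(), and computes a pairwise distance between individuals. The matrix, its 0/1/2 scoring, the temporary file, and the choice of Euclidean distance are all assumptions made for illustration; none of them are taken from the dataset or its analysis script.

```r
# Hypothetical toy data: 4 individuals x 6 bi-allelic SNP loci, scored
# 0/1/2 (count of the alternate allele). Not part of the Dryad dataset.
set.seed(1)
geno <- matrix(
  sample(0:2, 24, replace = TRUE),
  nrow = 4,
  dimnames = list(paste0("ind", 1:4), paste0("snp", 1:6))
)

# Write the object to a temporary .rds file (saveRDS compresses by default)
# and read it back with the base R function named in the description.
path <- tempfile(fileext = ".rds")
saveRDS(geno, path)
geno_in <- readRDS(path)

# Euclidean distance between individuals: one common choice among many,
# used here only as an example of a distance measure on SNP genotypes.
d <- dist(geno_in, method = "euclidean")
print(d)
```

To read one of the files shipped with the dataset, replace `path` with the location of that file; the analysis script remains the authoritative source for file names and the distances actually used.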

Provider:
Dryad
Created:
2024-01-16