Range-wide snowshoe hare exome SNP genotypes
Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2024-07-28.
Download link:
https://figshare.com/articles/Range-wide_snowshoe_hare_exome_SNP_genotypes/12030663/1
Description:
A .ped file containing 10388 unlinked SNP genotypes for snowshoe hares, derived from a whole exome capture. The SNPs were used to perform a population structure analysis in Admixture v1.3.0.

Raw sequence data were first cleaned by trimming adapters and low-quality bases (mean phred-scaled quality score <15 across a 4 bp window) using Trimmomatic v0.35 (Bolger et al. 2014). Paired-end reads overlapping by more than 10 bp and with fewer than 10% mismatched bases were then merged using FLASH2 (Magoč and Salzberg 2011). Cleaned sequence data were mapped to a snowshoe hare pseudoreference genome (see Jones et al. (2018) for details) using default settings in BWA-MEM v0.7.12 (Li 2013). We used PicardTools to remove duplicate reads with the MarkDuplicates function and assigned read group information with the AddOrReplaceReadGroups function. Using GATK v3.4.046 (McKenna et al. 2010), we then identified poorly aligned genomic regions with RealignerTargetCreator and locally realigned sequence data in these regions with IndelRealigner.

We performed multi-sample variant calling for previously defined snowshoe hare populations using default settings with the GATK UnifiedGenotyper and performed variant filtration in VCFtools v0.1.14 (Danecek et al. 2011). We filtered genotypes with individual coverage <5x or >70x or with a phred-scaled quality score <30. Additionally, we removed all indel variants and filtered SNPs with a phred-scaled quality score <30 or Hardy-Weinberg P < 0.001. We required that sites have no missing data across individuals and be more than 10 kb apart. We then created a .ped and .map file using the --plink flag in VCFtools.
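As a quick sanity check on the downloaded file, the following minimal Python sketch summarizes a PLINK .ped file: number of individuals, number of SNPs, and number of missing genotypes. It assumes the standard PLINK text layout (six leading columns followed by two allele columns per SNP) and the default missing-allele code "0". The function names are illustrative and are not part of the dataset or of any tool named above.

```python
from typing import Iterable, List, Tuple

# Standard PLINK .ped layout: family ID, individual ID, paternal ID,
# maternal ID, sex, phenotype, then two allele columns per SNP.
PED_LEADING_COLUMNS = 6


def parse_ped_line(line: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Split one .ped row into an individual ID and a list of allele pairs."""
    fields = line.split()
    alleles = fields[PED_LEADING_COLUMNS:]
    if len(fields) < PED_LEADING_COLUMNS or len(alleles) % 2 != 0:
        raise ValueError("malformed .ped row")
    genotypes = list(zip(alleles[0::2], alleles[1::2]))
    return fields[1], genotypes


def summarize_ped(lines: Iterable[str], missing: str = "0") -> Tuple[int, int, int]:
    """Return (individuals, SNPs, missing genotypes) for .ped rows.

    Assumes "0" marks a missing allele (the PLINK default); pass a
    different code if the file uses one.
    """
    n_individuals = 0
    n_snps = None
    n_missing = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        _, genotypes = parse_ped_line(line)
        if n_snps is None:
            n_snps = len(genotypes)
        elif len(genotypes) != n_snps:
            raise ValueError("rows have differing numbers of SNPs")
        n_missing += sum(1 for a, b in genotypes if a == missing or b == missing)
        n_individuals += 1
    return n_individuals, n_snps or 0, n_missing


# Toy rows for illustration only; these are not taken from the dataset.
example_rows = [
    "F1 I1 0 0 0 -9 A A C T",
    "F2 I2 0 0 0 -9 A G 0 0",
]
individuals, snps, missing_genotypes = summarize_ped(example_rows)
```

To apply it to the real file, pass an open file handle, for example `summarize_ped(open(path))` with `path` pointing at the downloaded .ped file. If the description above holds, the SNP count should be 10388 and the missing-genotype count should be zero.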
Provider:
figshare
Created:
2020-03-25



