Population structure, ancestral admixture, gene flow, and landscape association of blacklegged ticks during range expansion in the Midwestern U.S.
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
http://datadryad.org/dataset/doi%253A10.5061%252Fdryad.c866t1gh7
Description:
We sampled and sequenced 517 Ixodes scapularis individuals from 42 populations at different spatial locations to study neutral variation, population structure, ancestral admixture, genetic connectivity, and landscape influences on gene flow. We began with genomic data preprocessing, variant calling, variant filtering, and a concordance check. We then used the finalized dataset in variant call format (VCF), together with the spatial locations, to compute genetic distance statistics, model isolation by distance, and calculate summary statistics. Next, we used the VCF and sample metadata to conduct principal component analysis and clustering analysis to characterize population structure and ancestral admixture. To assess region-wide gene flow connectivity, we conducted effective migration surface analysis and graph network analyses to visualize dispersal routes and their extent. Lastly, we processed landscape and ecological data for landscape genomic analyses of the impact of landscape on gene flow, and visualized routes of dispersal across favorable environmental conditions.
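The isolation-by-distance step mentioned above is commonly tested with a Mantel permutation test between a genetic and a geographic distance matrix. The following is a minimal sketch of that idea in plain NumPy, not the authors' pipeline. The function name `mantel`, the toy coordinates, and the simulated distances are all hypothetical and stand in for the real population data.

```python
import numpy as np

def mantel(gen_dist, geo_dist, permutations=999, seed=0):
    """One-sided Mantel test (Pearson r) between two symmetric distance matrices."""
    gen = np.asarray(gen_dist, dtype=float)
    geo = np.asarray(geo_dist, dtype=float)
    n = gen.shape[0]
    iu = np.triu_indices(n, k=1)  # each population pair counted once
    r_obs = np.corrcoef(gen[iu], geo[iu])[0, 1]

    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(permutations):
        p = rng.permutation(n)
        # Permute rows and columns together so the matrix stays a valid distance matrix
        r_perm = np.corrcoef(gen[np.ix_(p, p)][iu], geo[iu])[0, 1]
        if r_perm >= r_obs:
            hits += 1
    p_value = (hits + 1) / (permutations + 1)
    return r_obs, p_value

# Toy data (assumption): 8 hypothetical populations with random coordinates
rng = np.random.default_rng(1)
coords = rng.uniform(0, 100, size=(8, 2))
diff = coords[:, None, :] - coords[None, :, :]
geo = np.sqrt((diff ** 2).sum(axis=-1))

# Simulated genetic distance that grows with geographic distance, plus symmetric noise
noise = rng.normal(0, 0.002, size=geo.shape)
noise = (noise + noise.T) / 2
np.fill_diagonal(noise, 0.0)
gen = 0.001 * geo + noise

r_obs, p_value = mantel(gen, geo)
print(f"Mantel r = {r_obs:.3f}, p = {p_value:.3f}")
```

With real data, `gen` would be a pairwise genetic distance matrix between populations and `geo` the matching matrix of geographic distances, with rows and columns in the same population order.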
Methods
Table of Contents
Part 1. Genomic Data Preprocessing, Variant Calling, Variant filtering and concordance
1A. Demultiplex and assign ID - PROCESS_RADTAGS in STACKS 2.64
1B. Adapter removal using CUTADAPT 3.5
1C. Read trimming with Trimmomatic 0.39
1D. Read mapping with reference genome using bwa-mem 0.7.17-r1188
1E. Alignment file conversion, and then resequencing bam merge - SAMTOOLS 1.16.1
1F. Variant calling - GATK 4.4.0.0
1G. Sample missingness filtering - PLINK2 v2.00a3 SSE4.2
1H. Variant filtering - GATK 4.4.0.0
1I. Variant missingness filtering - VCFTOOLS 0.1.17
1J. Genotype imputation in BEAGLE v5.4
1K. Genotype concordance between non-amplified and amplified ticks - GATK
Part 2. Genetic distance statistics, Isolation by distance and Summary stats
2A. Genetic distance via ADEGENET 2.1.10, GRAPH4LG 1.8.0, and MMOD 1.3.3
2B. Isolation by distance and Mantel correlogram
2C. Expected heterozygosity and nucleotide diversity via POPULATIONS in STACKS 2.64
2D. Tajima's D via DADI
2E. Rarefied private alleles via ADZE-1.0
Part 3. Population structure via PCA, SNMF, and CONSTRUCT
3A. Principal component analysis
3B. SNMF via LEA 3.12.2
3C. CONSTRUCT 1.0.5
Part 4. Effective Migration surface and Graph Networks
4A. Effective Migration surface via FEEMS
4B. Graph Networks via GRAPH4LG
Part 5. Landscape ecological data collection and processing
5A. Landscape data download, reformat, variable selection, NA cell treatment, and aggregation
Part 6. Isolation by resistance (IBR) and environment (IBE) analyses
6A. Isolation by resistance modeling, model selection, and resistance mapping via RADISH
6B. Current Flow map via CIRCUITSCAPE and visualization
6C. Isolation by Environment modeling, and co-estimating with Isolation by Distance and Isolation by Resistance via MMRR implemented in ALGATR
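To illustrate the principal component analysis listed in Part 3A, here is a minimal sketch of PCA on a genotype matrix using a singular value decomposition. The listing does not name the software used for PCA, so this is an assumption-laden illustration rather than the authors' method. The function name `genotype_pca` and the simulated two-group genotypes are hypothetical.

```python
import numpy as np

def genotype_pca(genotypes, n_components=2):
    """PCA of an individuals x SNPs matrix of alternate-allele counts (0, 1, 2)."""
    g = np.asarray(genotypes, dtype=float)
    g = g[:, g.std(axis=0) > 0]        # drop monomorphic SNPs, which carry no signal
    centered = g - g.mean(axis=0)      # center each SNP across individuals
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]  # individual coordinates on the PCs
    explained = (s ** 2) / np.sum(s ** 2)            # fraction of variance per PC
    return scores, explained[:n_components]

# Toy data (assumption): two hypothetical groups of 20 individuals and 500 SNPs,
# with allele frequencies that differ between the groups
rng = np.random.default_rng(42)
freq_a = rng.uniform(0.1, 0.9, size=500)
freq_b = np.clip(freq_a + rng.normal(0, 0.3, size=500), 0.05, 0.95)
geno_a = rng.binomial(2, freq_a, size=(20, 500))
geno_b = rng.binomial(2, freq_b, size=(20, 500))
genotypes = np.vstack([geno_a, geno_b])

scores, explained = genotype_pca(genotypes)
print("PC1 group means:", scores[:20, 0].mean(), scores[20:, 0].mean())
print("Variance explained:", explained)
```

In practice the genotype matrix would be read from the filtered VCF, and the PC scores would be plotted with individuals colored by sampling population.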
Created:
2025-01-09



