Genomic variant data and codes used for analysis in the manuscript - Whole genome sequencing reveals the structure of environment associated divergence in a broadly distributed montane bumble bee, Bombus vancouverensis
Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://figshare.com/articles/dataset/b_vanc_fully_filtered_100k_plus_recode_vcf_gz/20310522
Resource description:
The files included in this dataset are described below.
delly_vanc.vcf.gz # Raw output of Delly
b.vanc.fully.filtered.100k.plus.recode.vcf.gz # Output of freebayes, filtered using VCFtools v0.1.13 (Danecek et al. 2011) with the following flags: --remove-indels --min-alleles 2 --max-alleles 2 --minQ 20 --minDP 4 --max-missing 0.75
# The above file was also filtered to remove sites with unusually high coverage (>2x mean coverage) or excess heterozygosity. Finally, SNPs on scaffolds shorter than 100 kb were removed.
b.vanc.fully.filtered.100k.plus.recode.maf05.recode.ANN.vcf.gz # Fully filtered variant file (see manuscript for details) with annotation information
b.vanc.fully.filtered.100k.plus.recode.maf05.recode.impute.vcf.gz # Fully filtered variant file (see manuscript for details) after imputation with Beagle
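The last filtering step described above (removing SNPs on scaffolds shorter than 100 kb) could be sketched in shell roughly as follows. This is an illustrative sketch, not the authors' filter_vcf_size.sh. The function name is hypothetical, and it assumes the VCF header declares every scaffold in a `##contig=<ID=...,length=...>` line, which the description above does not state.

```shell
# Keep only VCF records on contigs of at least min_len bp.
# Reads an uncompressed VCF on stdin and writes the filtered VCF to stdout.
# Assumption: contig lengths are declared in "##contig=<ID=...,length=...>"
# header lines; records on undeclared contigs are dropped.
filter_by_scaffold_length() {
  local min_len="$1"
  awk -v min_len="$min_len" '
    /^##contig=/ {
      id = $0;  sub(/.*ID=/, "", id);      sub(/[,>].*/, "", id)
      len = $0; sub(/.*length=/, "", len); sub(/[,>].*/, "", len)
      contig_len[id] = len + 0
    }
    /^#/ { print; next }    # pass every header line through unchanged
    ($1 in contig_len) && contig_len[$1] >= min_len { print }
  '
}

# Example usage (file names are hypothetical):
#   zcat input.vcf.gz | filter_by_scaffold_length 100000 > filtered.vcf
```

The header is left untouched, so contig lines for the removed scaffolds remain in the output. Downstream tools generally tolerate this, but the lines can be pruned if a minimal header is wanted.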
#### Description of each script contained in this directory ####
Trim_N_QC.sh #Trims raw sequencing data and runs FastQC to evaluate the trimmed data
BWA_PICARD_vanc1.sh #Example of the script used to align sequence data to the reference genome using BWA. Also uses Picard tools to sort, deduplicate, and index BAM files
P_call_test-2-vanc.sh #First part of pipeline for calling SNPs with freebayes (calls freebayes-parallel-part1_vanc.sh)
freebayes-parallel-part1_vanc.sh #see above
Filter_vanc.sh #Creates list of SVs to filter from DELLY output
filter_delly.sh #Filters DELLY output based on the generated list of SVs
delly_vanc.sh #Calls SVs using DELLY
bcf2vcf.sh # Converts BCF output from DELLY to VCF format
freebayes-parallel-part2.sh #Second part of freebayes pipeline
merge_vanc_vars.sh #Second part of freebayes pipeline (calls freebayes-parallel-part2.sh)
site_depth_vanc.sh #Gets site depth per SNP
remove_highdepth_vanc.sh #Removes SNPs above depth threshold
hardy_vanc.sh #Calculates HWE per SNP
remove_hwe_vanc.sh #Removes SNPs based on HWE threshold
filter_vcf_size.sh #Removes SNPs on scaffolds less than 100 kb in size
filter_vcf_maf05.sh #Filters SNPs using a 5% MAF threshold
beagle.sh #Imputes genotypes using Beagle
LEA_con.R #Converts VCF file into LFMM and geno formats
Snpeff_ANN.sh # Annotates VCF file using SnpEff
plink_for_sambaR.sh # Converts VCF file into a format ready for use in SambaR
LD_test.sh #Example of the script used to calculate LD per scaffold
vcf_stats.sh #Gets various stats from final filtered vcf
get_pi_diversity.sh #Gets per-population nucleotide diversity
sambaR.R #Runs SambaR
lfmm2_analysis.R #Code for running analysis on output of LFMM2 and generating graphs
Max_ent_map.R #Generates MaxEnt map
RDA_script.R #Code for RDA analysis of structural variants
snprelate_script.R #Runs SNPRelate and plots Fst and pi along scaffolds of interest
repeat_correctedfst.R #Analysis for correlation between repeat density and Fst
LD_script.R #analysis of linkage
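The depth filter described by site_depth_vanc.sh and remove_highdepth_vanc.sh (removing sites above 2x mean coverage) might look something like the sketch below. It is not the authors' script. The function name and the input layout (a tab-separated table with a header row and columns CHROM, POS, DEPTH) are assumptions made for illustration.

```shell
# Print CHROM and POS of sites whose depth exceeds `factor` times the mean
# depth across all sites (factor defaults to 2).
# Assumption: the input is a tab-separated table with one header row and
# the depth value in column 3. The file is read twice: once to compute the
# mean, once to report the outlier sites.
list_high_depth_sites() {
  local depth_table="$1" factor="${2:-2}"
  awk -F'\t' -v factor="$factor" '
    FNR == 1 { next }                      # skip the header row in both passes
    NR == FNR { sum += $3; n++; next }     # pass 1: accumulate total depth
    n > 0 && $3 > factor * (sum / n) { print $1 "\t" $2 }   # pass 2: outliers
  ' "$depth_table" "$depth_table"
}

# Example usage (file names are hypothetical):
#   list_high_depth_sites site_depth.tsv 2 > high_depth_sites.txt
```

The resulting two-column list of positions could then be handed to a filtering tool, for example through the `--exclude-positions` option of VCFtools.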
Created:
2022-07-14



