Genomic variant data and codes used for analysis in the manuscript - Whole genome sequencing reveals the structure of environment associated divergence in a broadly distributed montane bumble bee, Bombus vancouverensis
Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2024-07-29.
Download link:
https://figshare.com/articles/dataset/b_vanc_fully_filtered_100k_plus_recode_vcf_gz/20310522/2
Resource description:
Details of the files included in this dataset are given below.

delly_vanc.vcf.gz # Raw output of DELLY
b.vanc.fully.filtered.100k.plus.recode.vcf.gz # Output of freebayes, filtered using VCFtools v0.1.13 (Danecek et al. 2011) with the following flags: --remove-indels --min-alleles 2 --max-alleles 2 --minQ 20 --minDP 4 --max-missing 0.75. This file was also filtered to remove sites with unusually high coverage (>2x mean coverage) or excess heterozygosity. Finally, SNPs on scaffolds less than 100 kb in length were removed
b.vanc.fully.filtered.100k.plus.recode.maf05.recode.ANN.vcf.gz # Fully filtered variant file (see manuscript for details) with annotation information
b.vanc.fully.filtered.100k.plus.recode.maf05.recode.impute.vcf.gz # Fully filtered variant file (see manuscript for details) after imputation with Beagle

#### Description of each script contained in this directory ####

Trim_N_QC.sh # Trims raw sequencing data and runs FastQC to evaluate the trimmed data
BWA_PICARD_vanc1.sh # Example of the script used to align sequence data to the reference genome using BWA. Also uses Picard tools to sort, deduplicate and index BAM files
P_call_test-2-vanc.sh # First part of the pipeline for calling SNPs with freebayes (calls freebayes-parallel-part1_vanc.sh)
freebayes-parallel-part1_vanc.sh # See above
Filter_vanc.sh # Creates the list of SVs to filter from the DELLY output
filter_delly.sh # Filters based on the generated list of SVs
delly_vanc.sh # Calls SVs using DELLY
bcf2vcf.sh # Converts BCF output from DELLY to VCF format
freebayes-parallel-part2.sh # Second part of the freebayes pipeline
merge_vanc_vars.sh # Second part of the freebayes pipeline (calls freebayes-parallel-part2.sh)
site_depth_vanc.sh # Gets site depth per SNP
remove_highdepth_vanc.sh # Removes SNPs above the depth threshold
hardy_vanc.sh # Calculates HWE per SNP
remove_hwe_vanc.sh # Removes SNPs based on the HWE threshold
filter_vcf_size.sh # Removes SNPs on scaffolds less than 100 kb in size
filter_vcf_maf05.sh # Filters SNPs using a 5% MAF threshold
beagle.sh # Imputes using Beagle
LEA_con.R # Converts the VCF file into LFMM and geno formats
Snpeff_ANN.sh # Annotates the VCF file using SnpEff
plink_for_sambaR.sh # Converts the VCF file into a format ready for use in SambaR
LD_test.sh # Example of the script used to calculate LD per scaffold
vcf_stats.sh # Gets various statistics from the final filtered VCF
get_pi_diversity.sh # Gets per-population nucleotide diversity
sambaR.R # Runs SambaR
lfmm2_analysis.R # Code for running the analysis on the output of LFMM2 and generating graphs
Max_ent_map.R # Generates the MaxEnt map
RDA_script.R # Code for the RDA analysis of structural variants
snprelate_script.R # Runs SNPRelate and makes graphs of Fst and pi along scaffolds of interest
repeat_correctedfst.R # Analysis of the correlation between repeat density and Fst
LD_script.R # Analysis of linkage
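The site-level filters described for the filtered variant files (removing sites above 2x mean coverage, SNPs on scaffolds shorter than 100 kb, and SNPs below the 5% MAF threshold) can be illustrated with a minimal Python sketch. This is not the authors' code, which is in the shell scripts listed in this description. The `Site` record, the `filter_sites` function, the toy values, and the choice to compute mean coverage as the mean of per-site depths are all assumptions made for illustration. The excess-heterozygosity (HWE) filter is left out because its threshold is not stated here.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Site:
    """Hypothetical, simplified stand-in for one SNP record."""
    scaffold: str    # scaffold name
    pos: int         # 1-based position on the scaffold
    depth: float     # total read depth at the site
    alt_freq: float  # alternate allele frequency, between 0 and 1


def filter_sites(sites, scaffold_lengths,
                 max_depth_factor=2.0,      # ">2x mean coverage" rule
                 min_scaffold_len=100_000,  # "less than 100 kb" rule
                 min_maf=0.05):             # 5% MAF rule
    """Return the sites that pass all three illustrative filters."""
    if not sites:
        return []
    # Assumption: mean coverage is the mean of the per-site depths.
    depth_cap = max_depth_factor * mean(s.depth for s in sites)
    kept = []
    for s in sites:
        # Minor allele frequency is the rarer of the two alleles.
        maf = min(s.alt_freq, 1.0 - s.alt_freq)
        if s.depth > depth_cap:
            continue  # unusually high coverage
        if scaffold_lengths[s.scaffold] < min_scaffold_len:
            continue  # scaffold shorter than the length cutoff
        if maf < min_maf:
            continue  # minor allele too rare
        kept.append(s)
    return kept


# Toy data, invented for this example only.
scaffold_lengths = {"scaf1": 250_000, "scaf2": 80_000, "scaf3": 100_000}
sites = [
    Site("scaf1", 100, 30.0, 0.20),   # passes every filter
    Site("scaf1", 200, 200.0, 0.30),  # depth above 2x the mean
    Site("scaf2", 50, 28.0, 0.40),    # scaffold shorter than 100 kb
    Site("scaf1", 300, 32.0, 0.02),   # MAF below 5%
    Site("scaf1", 400, 30.0, 0.97),   # minor allele is the reference, MAF 3%
    Site("scaf3", 10, 30.0, 0.90),    # passes every filter
]
kept = filter_sites(sites, scaffold_lengths)
print([(s.scaffold, s.pos) for s in kept])
```

In the dataset itself these steps are applied in sequence by separate scripts (remove_highdepth_vanc.sh, filter_vcf_size.sh, filter_vcf_maf05.sh), so the single-pass function above is only a compact way to show the three criteria side by side.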
Provider:
figshare
Created:
2022-07-14