Data from: Unique evolutionary trajectories in repeated adaptation to hydrogen sulphide-toxic habitats of a neotropical fish (Poecilia mexicana)
Source: Mendeley Data. Updated 2024-06-25; indexed 2024-06-27.
Download link:
https://datadryad.org/stash/dataset/doi:10.5061/dryad.ts351
Description:
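SNP FST file
This is the pairwise Fst output for individual SNPs. Non-informative sites have been suppressed, so every row is a SNP that meets the coverage criteria. The columns are:
col1: scaffold number
col2: SNP position on the scaffold
col3: number of SNPs in the window (the window size was set to 1 to obtain per-SNP estimates, so every value should be 1)
col4: fraction of the window covered (more relevant for sliding-window analyses)
col5: mean coverage at the SNP over all four populations
col6: pairwise Fst for Tac-C:Tac-S
col7: pairwise Fst for Tac-C:Puy-C
col8: pairwise Fst for Tac-C:Puy-S
col9: pairwise Fst for Tac-S:Puy-C
col10: pairwise Fst for Tac-S:Puy-S
col11: pairwise Fst for Puy-C:Puy-S
Population codes: 1=Tac-C, 2=Tac-S, 3=Puy-C, 4=Puy-S.
Note that an Fst value of 0.000000000 means the SNP is not polymorphic in that pairwise comparison. For example, SNP 'NW_006799939.1 19911' has an Fst of 0.00000000 in the Tac-C:Tac-S comparison because it is not polymorphic between Tac-C and Tac-S. The corresponding line of the sync file is:
NW_006799939.1 19911 A 21:0:0:0:0:0 21:0:0:0:0:0 39:0:0:2:0:0 37:0:0:5:0:0
It shows that Tac-C and Tac-S are both fixed for the A allele (21 counts in each). The only reason the site is included as a SNP in the .fst file is that it is polymorphic in the Puy-C and Puy-S populations (and polymorphic in the comparison across Tac and Puy).
File name: Tac-C_Tac-S_Puy-C_Puy-S.fst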
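The reasoning in the note can be checked directly from the sync line it quotes. The following is a minimal Python sketch, not part of the deposited dataset. It assumes the usual sync layout (scaffold, position, reference base, then one A:T:C:G:N:del count field per population, in the order Tac-C, Tac-S, Puy-C, Puy-S), and it calls a pair "polymorphic" when more than one allele has a non-zero count across the two populations. That criterion is a simplification chosen for illustration; the function names are hypothetical.

```python
from itertools import combinations

# Assumed population order of the sync columns (matches the population codes 1-4).
POPS = ["Tac-C", "Tac-S", "Puy-C", "Puy-S"]
# Assumed order of the first four counts in each sync field (A:T:C:G:N:del).
ALLELES = "ATCG"


def parse_sync_line(line):
    """Return (scaffold, position, reference base, {population: {allele: count}})."""
    chrom, pos, ref, *fields = line.split()
    counts = {}
    for pop, field in zip(POPS, fields):
        values = [int(v) for v in field.split(":")]
        counts[pop] = dict(zip(ALLELES, values[:4]))  # N and deletion counts are ignored
    return chrom, int(pos), ref, counts


def polymorphic_pairs(counts):
    """Map each population pair to True if more than one allele is observed across the pair."""
    result = {}
    for a, b in combinations(POPS, 2):
        seen = {allele for pop in (a, b) for allele, n in counts[pop].items() if n > 0}
        result[(a, b)] = len(seen) > 1
    return result


# The sync line quoted in the description above.
line = "NW_006799939.1 19911 A 21:0:0:0:0:0 21:0:0:0:0:0 39:0:0:2:0:0 37:0:0:5:0:0"
chrom, pos, ref, counts = parse_sync_line(line)
pairs = polymorphic_pairs(counts)
print(pairs[("Tac-C", "Tac-S")])  # False
print(pairs[("Puy-C", "Puy-S")])  # True
```

The pairs are generated in the same order as col6 to col11, so only the first comparison (Tac-C:Tac-S) comes out as not polymorphic for this site.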
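Fisher's exact test data
This is the Fisher's exact test (FET) output for each SNP in each pairwise comparison. The same SNP definition was used as for the .fst output, so these are the FET results for the same SNPs included in the Fst output. The file has the same structure as Tac-C_Tac-S_Puy-C_Puy-S.fst, except that the values are -log10 P-values instead of Fst values.
File name: Tac-C_Tac-S_Puy-C_Puy-S.fet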
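Because the stored values are -log10 P-values, larger numbers mean stronger evidence, and a value has to be converted back before it can be compared with a conventional significance level. A minimal sketch of that arithmetic follows. The function names and the cutoff are hypothetical and illustrative only, not the authors' analysis.

```python
import math


def p_from_neg_log10(value):
    """Convert a -log10 P-value back to a P-value."""
    return 10.0 ** (-value)


def passes_cutoff(neg_log10_p, alpha=0.05):
    """True if the underlying P-value is below alpha (alpha is a hypothetical choice)."""
    return neg_log10_p > -math.log10(alpha)


print(p_from_neg_log10(3.0))  # 0.001
print(passes_cutoff(3.0))     # True
print(passes_cutoff(1.0))     # False
```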
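1kb window FST
This file contains the output of the 1000 bp sliding-window Fst analysis. The structure of the output file is similar to Tac-C_Tac-S_Puy-C_Puy-S.fst:
col1: reference contig (chromosome)
col2: mean position of the sliding window
col3: number of SNPs found in the window (not considering sites with a deletion)
col4: fraction of the window that has sufficient coverage (min. coverage <= cov <= max. coverage) in every population
col5: average minimum coverage in all populations
col6: 1:2, the pairwise Fst for populations 1 and 2
col7: 1:3, the pairwise Fst for populations 1 and 3
....
File name: Tac-C_Tac-S_Puy-C_Puy-S.1000.fst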
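Rows laid out as described above can be read into named records, which makes it easier to filter windows by SNP count or covered fraction. The sketch below is a hedged illustration, not a description of the deposited file's exact syntax. It assumes whitespace-separated columns, accepts pairwise fields written either as `1:2=0.0150` or as a bare number, and maps `na` to `None`. The example line, the thresholds, and the names `FstRow`, `parse_fst_line`, and `keep_window` are made up for the demonstration.

```python
from typing import Dict, NamedTuple, Optional

POP_CODES = {1: "Tac-C", 2: "Tac-S", 3: "Puy-C", 4: "Puy-S"}
# Order of the six pairwise columns (col6 to col11).
PAIRS = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]


class FstRow(NamedTuple):
    contig: str
    position: float
    n_snps: int
    covered_fraction: float
    coverage: float
    fst: Dict[str, Optional[float]]


def parse_value(field: str) -> Optional[float]:
    """Accept '1:2=0.0150' or a bare '0.0150'; map 'na' to None."""
    value = field.split("=")[-1]
    return None if value.lower() == "na" else float(value)


def parse_fst_line(line: str) -> FstRow:
    """Split one row into the five leading columns plus a dict of pairwise Fst values."""
    cols = line.split()
    names = [f"{POP_CODES[a]}:{POP_CODES[b]}" for a, b in PAIRS]
    fst = dict(zip(names, (parse_value(c) for c in cols[5:11])))
    return FstRow(cols[0], float(cols[1]), int(cols[2]),
                  float(cols[3]), float(cols[4]), fst)


def keep_window(row: FstRow, min_snps: int = 1, min_fraction: float = 0.5) -> bool:
    """Hypothetical filter: enough SNPs and enough of the window covered."""
    return row.n_snps >= min_snps and row.covered_fraction >= min_fraction


# Made-up example row, not taken from the dataset.
example = ("scaffold_1\t500\t3\t0.850\t41.2\t1:2=0.0150\t1:3=0.2100\t"
           "1:4=0.3300\t2:3=0.1900\t2:4=0.3000\t3:4=0.0200")
row = parse_fst_line(example)
print(row.fst["Tac-C:Puy-S"])  # 0.33
print(keep_window(row))        # True
```

The same parser applies to the per-SNP .fst file, where col3 is always 1.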
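Puy-C Tajima's D
Tajima's D output for 1000 bp windows for Puy-C. The script has no option to suppress non-informative windows, so many windows that fail the coverage criteria are reported as "na" because no calculation was made.
File name: Puy-C.D
Puy-S Tajima's D
Tajima's D output for 1000 bp windows for Puy-S. The script has no option to suppress non-informative windows, so many windows that fail the coverage criteria are reported as "na" because no calculation was made.
File name: Puy-S.D
Tac-C Tajima's D
Tajima's D output for 1000 bp windows for Tac-C. The script has no option to suppress non-informative windows, so many windows that fail the coverage criteria are reported as "na" because no calculation was made.
File name: Tac-C.D
Tac-S Tajima's D
Tajima's D output for 1000 bp windows for Tac-S. The script has no option to suppress non-informative windows, so many windows that fail the coverage criteria are reported as "na" because no calculation was made.
File name: Tac-S.D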
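Since the four .D files keep the "na" windows, a reader will usually want to drop them before summarising. The description does not list the columns of these files, so the sketch below rests on stated assumptions: whitespace-separated rows with the contig in the first column, the window position in the second, and the Tajima's D value (or "na") in the last. The function name and the sample rows are hypothetical.

```python
import io


def read_tajimas_d(handle):
    """Yield (contig, position, D) for windows where a value was computed.

    Assumed layout: contig first, window position second, D (or 'na') last.
    """
    for line in handle:
        cols = line.split()
        if not cols or cols[-1].lower() == "na":
            continue  # skip blank lines and windows that failed the coverage criteria
        yield cols[0], int(cols[1]), float(cols[-1])


# Made-up rows standing in for a .D file.
sample = io.StringIO(
    "scaffold_1\t500\t0\t0.120\tna\n"
    "scaffold_1\t1500\t4\t0.910\t-1.25\n"
    "scaffold_1\t2500\t2\t0.780\t0.40\n"
)
windows = list(read_tajimas_d(sample))
print(len(windows))                                # 2
print(sum(d for _, _, d in windows) / len(windows))
```

With a real file, the `io.StringIO` object would be replaced by an opened file such as `open("Puy-C.D")`, after checking that the column layout matches these assumptions.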
infilePuy68 for Migrate-N
Data input file for the Migrate-N analysis.
File name: infilePuy68.txt
infileTac67 for Migrate-N
Data input file for the Migrate-N analysis.
File name: infileTac67.txt
parmfile Migrate-N
Parameter file used for both population pairs.
File name: parmfile.txt
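Allele frequency estimates from PoolSeq analysis
PoPoolation output file containing the read count data for polymorphic sites. The columns are:
col1: scaffold number (chr)
col2: position on the scaffold (pos)
col3: base in the reference genome (rc)
col4: number of alleles (allele_count)
col5: allelic states (allele_states)
col6: number of deletions (deletion_sum)
col7: whether the SNP is variable among populations or against the reference (snp_type)
col8: major alleles in the populations, in the order Tac-C, Tac-S, Puy-C, Puy-S (major_alleles(maa))
col9: minor alleles in the populations, in the same order (minor_alleles(mia))
col10 to col13: frequency estimates of the major allele, expressed as a ratio of reads for the respective population (maa_1, maa_2, maa_3, maa_4)
col14 to col17: the same for the minor alleles (mia_1, mia_2, mia_3, mia_4)
File name: Tac-C_Tac-S_Puy-C_Puy-S_rc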
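The read ratios in the maa and mia columns can be turned into numeric allele frequencies per population. The sketch below assumes each ratio is written as `reads/total` (for example `18/20`) and that the row is whitespace-separated with the 17 columns in the order listed above. Both points should be checked against the actual file. The example row is invented and the function names are hypothetical.

```python
RC_COLUMNS = [
    "chr", "pos", "rc", "allele_count", "allele_states", "deletion_sum",
    "snp_type", "major_alleles(maa)", "minor_alleles(mia)",
    "maa_1", "maa_2", "maa_3", "maa_4",
    "mia_1", "mia_2", "mia_3", "mia_4",
]
POPS = ["Tac-C", "Tac-S", "Puy-C", "Puy-S"]


def parse_ratio(text):
    """Convert an assumed 'reads/total' string to a frequency; None if total is 0."""
    reads, total = (int(part) for part in text.split("/"))
    return reads / total if total else None


def parse_rc_line(line):
    """Return the raw row as a dict plus major-allele frequencies per population."""
    row = dict(zip(RC_COLUMNS, line.split()))
    major = {pop: parse_ratio(row[f"maa_{i}"]) for i, pop in enumerate(POPS, start=1)}
    return row, major


# Made-up example row, not taken from the dataset.
example = ("scaffold_1 100 A 2 A/G 0 pop AAAA NGGG "
           "20/20 18/20 30/40 36/40 0/20 2/20 10/40 4/40")
row, major = parse_rc_line(example)
print(major["Tac-C"])  # 1.0
print(major["Puy-C"])  # 0.75
```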
Created:
2023-06-28



