8. Ecological genomics of the Northern krill: Recombination rates and demographic history
Source: Figshare. Updated: 2024-03-28. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/8_Ecological_genomics_of_the_Northern_krill_Recombination_rates_and_demographic_history/22825277
Description:
This item contains archives of data and results used to assess recombination rates (iSMC), demographic history (PSMC, MSMC) and haplotype ages (GEVA) using coalescent methods.

Population definitions

Population definitions are the same as described in a different item:
"at vs. me" = Atlantic Ocean samples (n=67) vs. the Mediterranean (i.e. Barcelona) samples (n=7).
"we vs. ea" = South-West North Atlantic Ocean (n=20) vs. North-East North Atlantic Ocean (n=47). In files using this contrast, the label "wa" is sometimes used instead of "we" for the South-West North Atlantic Ocean samples.

Contents:

psmc_dataset.psmcfa.gz, datasets for PSMC analyses containing signatures of heterozygosity in the reference specimen that were converted from VCF into the FASTA-like PSMCFA format.
msmc_datasets.tar.gz, datasets for MSMC analyses containing signatures of heterozygosity in the reference specimen that were converted from VCF into TSV.
ismc_dataset.tar.gz, the VCF dataset and accessory files for iSMC analyses used to infer recombination rates.
geva_datasets.candidates.at_vs_me.tar.gz, the re-coded VCF and binary format datasets as well as analysis output for the 660 candidate gene loci analyzed for "at" and "me" populations in the "at vs. me" contrast.
geva_datasets.candidates.we_vs_ea.tar.gz, the re-coded VCF and binary format datasets as well as analysis output for the 34 candidate gene loci analyzed for "we" and "ea" populations in the "we vs. ea" contrast.
geva_results.candidates.at_vs_me.tar.gz, the resulting age estimates of minor alleles in the "at vs. me" contrast.
geva_results.candidates.we_vs_ea.tar.gz, the resulting age estimates of minor alleles in the "we vs. ea" contrast.

psmc_dataset.psmcfa.gz

A FASTA-like file that encodes the distribution of heterozygous genotypes across 4,911 sequences in the diploid reference specimen at a resolution of 10 bp windows. Character states are:
N = a window with only inaccessible sites (i.e. missing data)
T = a window with accessible data
K = a window with accessible data and at least one heterozygous genotype
This format is further documented on the site of the original tool: https://github.com/lh3/psmc

msmc_datasets.tar.gz

This archive contains one TSV file per sequence (n=5,176), each specifying the distribution of heterozygous genotypes. Each file contains four fields. Example: seq_s_12039171TC
1. name of the sequence
2. position of the heterozygous genotype
3. number of accessible sites since the last heterozygous genotype
4. the heterozygous genotype (a string of only two alleles in this case, because a single individual is analysed)
This format is further documented on the site of the original tool: https://github.com/stschiff/msmc-tools/blob/master/msmc-tutorial/guide.md

ismc_dataset.tar.gz

This archive contains several files:
1.merged_contigs.vcf = specifies the distribution of heterozygous genotypes in VCF format
1.merged_contigs.tab = specifies the lengths of sequences (TSV format)
1.merged_contigs.bpp = the program control file with run-time parameters (TXT)
1.merged_contigs.fasta = specifies accessible and inaccessible sites ("N") in FASTA format
1.merged_contigs.out_estimates.txt = the summary results of the analysis (TXT)

geva_datasets.candidates.at_vs_me.tar.gz and geva_datasets.candidates.we_vs_ea.tar.gz

These archives hold data and results from analysing variant ages at each of the 660 or 34 candidate gene loci with divergent haplotypes in each of the two contrasts. For each locus, the files comprise:
Two recoded VCF files. In the first file, the minor allele in one of the two populations (e.g. "at") was taken to represent the derived allele and coded as the ALT allele. In the second file, the minor allele in the other group (e.g. "me") was taken to represent the derived allele and coded as the ALT allele.
Intermediate data files generated by GEVA when processing the VCF files (*.bin, *.marker.txt, *.sample.txt), including a log and an err file.
Results files (*.pairs.txt.gz and *.sites.txt). The "*.sites.txt" files contain allele age estimates under the mutation clock (M), recombination clock (R), and joint clock (J) models. The format of these files is described on the site of the original tool: https://github.com/pkalbers/geva

geva_results.candidates.at_vs_me.tar.gz and geva_results.candidates.we_vs_ea.tar.gz

These archives contain four TSV files each. For each population (e.g. "at") there are two files: one collects all minor allele age estimates under all three models, and the other only those under the joint model.
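
As an illustration of the N/T/K window encoding described for psmc_dataset.psmcfa.gz, the following Python sketch tallies window states per sequence and reports the fraction of accessible windows that carry at least one heterozygous genotype. It is a minimal sketch and not part of the deposited analysis: the function names (read_psmcfa, window_summary, summarise_file) and the toy record are hypothetical, and the reader assumes plain FASTA-style records with ">" header lines.

```python
import gzip
import io
from collections import Counter


def read_psmcfa(handle):
    """Yield (name, windows) for each record of a FASTA-like PSMCFA text stream."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:].strip(), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def window_summary(windows):
    """Count N, T and K windows and the share of accessible windows that are K."""
    counts = Counter(windows)
    accessible = counts["T"] + counts["K"]
    het_fraction = counts["K"] / accessible if accessible else float("nan")
    return {
        "N": counts["N"],
        "T": counts["T"],
        "K": counts["K"],
        "het_fraction": het_fraction,
    }


def summarise_file(path):
    """Summarise every sequence in a gzip-compressed PSMCFA file."""
    with gzip.open(path, "rt") as handle:
        return {name: window_summary(w) for name, w in read_psmcfa(handle)}


# Toy record (hypothetical data, not taken from the archive).
toy = ">seq_demo\nTTKTN\nNTTKT\n"
for name, windows in read_psmcfa(io.StringIO(toy)):
    print(name, window_summary(windows))
```

For the real file, summarise_file("psmc_dataset.psmcfa.gz") would return one summary per sequence, assuming the file has been downloaded to the working directory.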
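
The four-field TSV layout described for msmc_datasets.tar.gz can be read along the following lines. This is a hedged sketch: the names MsmcSite, parse_msmc and heterozygosity are hypothetical, the toy rows are invented, and the heterozygosity estimate is only approximate because accessible sites after the last heterozygous genotype of a sequence are not recorded in this layout.

```python
import io
from typing import Iterator, NamedTuple


class MsmcSite(NamedTuple):
    sequence: str               # field 1: name of the sequence
    position: int               # field 2: position of the heterozygous genotype
    accessible_since_last: int  # field 3: accessible sites since the last one
    alleles: str                # field 4: genotype string, e.g. "TC"


def parse_msmc(handle) -> Iterator[MsmcSite]:
    """Yield one MsmcSite per non-empty, tab-separated line."""
    for line_number, line in enumerate(handle, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("\t")
        if len(fields) != 4:
            raise ValueError(
                f"line {line_number}: expected 4 fields, got {len(fields)}"
            )
        sequence, position, accessible, alleles = fields
        yield MsmcSite(sequence, int(position), int(accessible), alleles)


def heterozygosity(sites) -> float:
    """Heterozygous genotypes per accessible site, summed over the given rows."""
    sites = list(sites)
    accessible = sum(site.accessible_since_last for site in sites)
    return len(sites) / accessible if accessible else float("nan")


# Toy rows (hypothetical values, not taken from the archive).
toy = "seq_demo\t100\t80\tAG\nseq_demo\t250\t120\tTC\n"
sites = list(parse_msmc(io.StringIO(toy)))
print(sites[0])
print(heterozygosity(sites))
```

Raising an error on rows without exactly four fields is a deliberate choice here, so that a file with lost tab separators fails loudly instead of being silently misread.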
Created:
2024-03-28



