
Local genetic adaptation to habitat in wild chimpanzees

Source: DataCite Commons. Updated 2025-01-10; indexed 2025-04-17.
Download link:
https://rdr.ucl.ac.uk/articles/dataset/Local_genetic_adaptation_to_habitat_in_wild_chimpanzees/26840767/1
Description:
This directory contains data used in Ostridge et al. (2025), Science.

genotype_likelihoods/data
- This directory contains ANGSD output files for all the SNPs and populations that passed filtering in Ostridge et al. These files were not used in the analysis pipeline; they were generated afterwards to provide a dataset of genotype likelihoods (GL) for the SNPs that passed quality filtering in the Ostridge et al. analysis.
- The sequence data used to generate these files are available on ENA under the accession code ENA:PRJEB76176.
- GL files are the *.beagle.gz files.
- *.sites.file lists the SNPs that passed filtering in Ostridge et al.; it was passed to the ANGSD -sites option.
- All other file types are output by ANGSD (details at https://www.popgen.dk/angsd/index.php/ANGSD).
- *.arg files show the arguments given to ANGSD.
- *.hwe.gz files show the Hardy-Weinberg equilibrium p-values per SNP.
- *.mafs.gz files contain the minor allele frequency estimates.
- *.snpStat.gz files contain a range of statistics calculated for each SNP, including sequencing depth.

genotype_likelihoods/data/f5.0.5x.all_doMajorMinor.1_sites.and.pops.to.match.baypass.input.minMAC2_beagle
- This contains the ANGSD output files for SNPs in the All subspecies dataset in Ostridge et al.
- This was generated using genotype_likelihoods/scripts/run_angsd_f5.0.5x.all_doMajorMinor.1_sites.and.pops.to.match.baypass.input.minMAC2_beagle_2024.sh.
- angsd_f5.0.5x.all_doMajorMinor.1_sites.and.pops.to.match.baypass.input.minMAC2_beagle_2024.e576361 shows the command line output from the ANGSD run.

genotype_likelihoods/data/f5.0.5x.all_doMajorMinor.1_sites.and.pops.to.match.baypass.input.minMAC2_beagle
- This contains the ANGSD output files for each of the subspecies-specific datasets in Ostridge et al.
- These are found in corresponding subdirectories: c=Central, e=Eastern, ce=Central-Eastern, n=Nigeria-Cameroon, and w=Western.
- This was generated using genotype_likelihoods/scripts/run_angsd_f5.0.5x.subsp_doMajorMinor.1_sites.and.pops.to.match.baypass.input.minMAC2_beagle_2024.sh.
- angsd_f5.0.5x.subsp_doMajorMinor.1_sites.and.pops.to.match.baypass.input.minMAC2_beagle_2024.e576327 shows the command line output from the ANGSD run.

gowinda_association_files/
- This directory contains the association files used for the Gowinda gene set enrichment analysis in Ostridge et al.
- Full details of how these were generated are found at https://github.com/HarrJO/PanAf-local-adaptation/blob/main/gowinda/baypass_core/scripts/run_gowinda.v23.Rmd.
- KEGG.biosystems.gene.set_and.all.txt - KEGG categories
- dehy_association.gene.set_and.all.txt - dehydration response categories
- imm_association.gene.set_and.all.txt - immunity genes
- phen_association.gene.set_and.all.txt - phenotype database
- REAC.biosystems.gene.set_and.all.txt - Reactome database
- expr_association.gene.set_and.all.txt - tissue expression data
- mergedlthtvip.gene.set_and.all.txt - viral interacting proteins (VIPs)
- association_gominer.txt - gene ontology (GO) categories
- gwas_association.gene.set_and.all.txt - GWAS traits
- pathogen_ebel2017_association.gene.set_and.all.txt - pathogen-related genes

habitat_data/population_forest-tree-percentage.csv
- This contains the forest-tree-percentage values used for each population in the BayPass GEA.
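
For readers who want to inspect the *.beagle.gz genotype likelihood files, the following is a minimal Python sketch, not part of the dataset or its scripts. It assumes the usual ANGSD Beagle GL layout: a tab-separated header with marker, allele1, and allele2 columns, followed by three likelihood columns per individual. The function names, the example values, and the file name in the usage note are hypothetical.

```python
import gzip
import io


def read_beagle_gl(handle):
    """Parse a Beagle-style GL table from an open text handle.

    Returns (samples, records), where each record is
    (marker, allele1, allele2, [(gl_AA, gl_AB, gl_BB), ...]).
    Assumes three likelihood columns per individual.
    """
    header = handle.readline().rstrip("\n").split("\t")
    sample_cols = header[3:]
    if len(sample_cols) % 3 != 0:
        raise ValueError("expected three likelihood columns per individual")
    # Each individual's name is repeated three times; keep one copy.
    samples = sample_cols[::3]

    records = []
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        marker, allele1, allele2 = fields[:3]
        values = [float(x) for x in fields[3:]]
        # Group the flat list of values into one triple per individual.
        gls = [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]
        records.append((marker, allele1, allele2, gls))
    return samples, records


def read_beagle_gz(path):
    """Open a gzip-compressed Beagle GL file and parse it."""
    with gzip.open(path, "rt") as handle:
        return read_beagle_gl(handle)


# Tiny in-memory example with made-up values (not taken from the dataset).
example = (
    "marker\tallele1\tallele2\tInd0\tInd0\tInd0\tInd1\tInd1\tInd1\n"
    "chr1_1000\t0\t2\t0.9\t0.1\t0.0\t0.2\t0.6\t0.2\n"
)
samples, records = read_beagle_gl(io.StringIO(example))
```

On a downloaded file, the same parser could be called as `read_beagle_gz("example.beagle.gz")`, where the file name is a placeholder. For large files, it may be preferable to iterate over records rather than collect them all in a list.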

Provider:
University College London
Created:
2025-01-02