
GNRS (Graph nonreference sequences) Pangenome of human nonreference sequences from population-scale long-read sequencing

NIAID Data Ecosystem, indexed 2026-05-01
Download link:
https://zenodo.org/record/10554484
Description:
To make the results reproducible, the pipeline is implemented with the Snakemake workflow framework, with software environments managed by Anaconda. The pipeline can also be applied to other cohorts with long-read sequencing data.

Schematic representation of GraphNRS
a, Long-read sequencing data from different platforms are de novo assembled and polished.
b, The NRSs are anchored to GRCh38. Placed NRSs are clustered to select representative NRSs, and unplaced NRSs are clustered after contaminants and centromeric repeats are filtered out. The placed and unplaced NRSs are then merged to obtain the nonredundant NRSs of the whole population.
c, vg is used to construct the graph pangenome, and each NRS is genotyped in each individual.

Requirements
1. wtdbg2 v2.5
2. MarginPolish v1.3.0
3. Hifiasm v0.16.1-r375
4. NextPolish v1.4.0
5. QUAST v5.0.2
6. AGE v0.4
7. Kalign v3.3
8. Jasmine v1.1.0
9. vg toolkit v1.33.1
10. GraphAligner v1.0.13
11. snakemake v7.2.1

Data availability
The sequencing data for all 539 individuals in this study are publicly available. Detailed information about these datasets is provided in Supplementary Table 1. The sequences and genotypes of the nonredundant NRSs are publicly accessible through the National Genomics Data Center (NGDC), China National Center for Bioinformation (CNCB), under accession number GVM000672 (https://ngdc.cncb.ac.cn/gvm/getProjectDetail?project=GVM000672). The data are also available on GitHub at https://github.com/xie-lab/GNRS/tree/main/NRS. The code for the GraphNRS pipeline is publicly available in the GitHub repository (https://github.com/xie-lab/GNRS).

Citation
Wu Z, Li T, Jiang Z, Zheng J, Gu Y, Liu Y, Liu Y, Xie Z. Human pangenome analysis of sequences missing from the reference genome reveals their widespread evolutionary, phenotypic, and functional roles. Nucleic Acids Res. 2024 Feb 14:gkae086. doi: 10.1093/nar/gkae086. Epub ahead of print. PMID: 38364871.
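Before running the pipeline on another cohort, it can help to confirm that the tools named under Requirements are installed. The following is a minimal sketch, not part of GraphNRS itself: the versions are copied from the Requirements list, while the executable names in ASSUMED_EXECUTABLES and the helper find_missing are assumptions made for illustration and may differ from how each tool is installed on a given system.

```python
import shutil

# Versions copied from the Requirements list in the description.
REQUIRED_VERSIONS = {
    "wtdbg2": "2.5",
    "MarginPolish": "1.3.0",
    "Hifiasm": "0.16.1-r375",
    "NextPolish": "1.4.0",
    "QUAST": "5.0.2",
    "AGE": "0.4",
    "Kalign": "3.3",
    "Jasmine": "1.1.0",
    "vg toolkit": "1.33.1",
    "GraphAligner": "1.0.13",
    "snakemake": "7.2.1",
}

# ASSUMPTION: typical command names for each tool; they are not stated in
# the description and may differ depending on the installation method.
ASSUMED_EXECUTABLES = {
    "wtdbg2": "wtdbg2",
    "MarginPolish": "marginPolish",
    "Hifiasm": "hifiasm",
    "NextPolish": "nextPolish",
    "QUAST": "quast.py",
    "AGE": "age_align",
    "Kalign": "kalign",
    "Jasmine": "jasmine",
    "vg toolkit": "vg",
    "GraphAligner": "GraphAligner",
    "snakemake": "snakemake",
}


def find_missing(executables, which=shutil.which):
    """Return the tool names whose executable is not found on PATH."""
    return [tool for tool, exe in executables.items() if which(exe) is None]


if __name__ == "__main__":
    for tool in find_missing(ASSUMED_EXECUTABLES):
        print(f"missing: {tool} (expected v{REQUIRED_VERSIONS[tool]})")
```

This check only reports whether a command is on PATH; it does not verify the installed version, since each tool reports its version in a different way.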
Created:
2024-02-18