
Human-specific tandem repeat expansion and differential gene expression during primate evolution

Source: NIAID Data Ecosystem (indexed 2026-04-25)
Download link:
https://zenodo.org/record/3401476
Description:
THIS DATASET IS PART OF THE FOLLOWING STUDY: https://www.pnas.org/content/early/2019/10/22/1912175116
THE RAW 10x GENOMICS SEQUENCING READS CAN BE DOWNLOADED FROM SRA: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA593056
ORIGINAL UPLOAD: 09/06/2019
UPDATES: 10/28/2019; 01/27/2020
DESCRIPTION: Contigs were assembled using Phased-SV (Chaisson et al., Nature Communications 2019) on six human haplotypes (i.e., H0 and H1 in NA19240, HG00514, and HG00733) and six nonhuman haplotypes (this study; H0 and H1 in Clint the chimpanzee, Kamilah the gorilla, and Susie the orangutan). The long-read data (PacBio CLR) from the nonhuman primates (NHPs) were phased into haplotypes H0 and H1 using linked reads from 10x Genomics prior to assembly, whenever possible. Where phasing was not possible (e.g., in long runs of homozygosity), long reads from both haplotypes were used to generate a "squished assembly". Using the human haplotype data, we identified 21,442 polymorphic STRs/VNTRs, followed by targeted phasing of these regions in the three NHPs. All of the human and nonhuman primate contigs were padded by 2 kbp both upstream and downstream and then mapped against the human reference (GRCh38). We did the same for the "squished assemblies" from a Yoruban individual, CHM13, and three NHPs, as described in Kronenberg et al., Science 2018. The BAM and BAI files in this dataset contain the alignments of all these contigs against GRCh38.
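The 2 kbp padding step can be illustrated with a minimal sketch. It assumes one reading of the description: that padding means extending a region's coordinates by 2,000 bp on each side, clamped to the sequence bounds. The function name `pad_interval`, the 0-based half-open coordinate convention, and the optional `chrom_length` argument are all hypothetical and are not taken from the study's pipeline.

```python
PAD_BP = 2_000  # 2 kbp on each side, as stated in the dataset description


def pad_interval(start, end, pad=PAD_BP, chrom_length=None):
    """Extend a 0-based, half-open interval [start, end) by `pad` bases per side.

    Assumption: padding is clamped at position 0 and, when `chrom_length`
    is given, at the end of the sequence.
    """
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start <= end")
    padded_start = max(0, start - pad)  # upstream padding, clamped at 0
    padded_end = end + pad              # downstream padding
    if chrom_length is not None:
        padded_end = min(chrom_length, padded_end)  # clamp at sequence end
    return padded_start, padded_end


if __name__ == "__main__":
    # A hypothetical 500 bp tandem repeat locus, padded on both sides.
    print(pad_interval(10_000, 10_500))
```

Clamping matters only for regions near the ends of a sequence; elsewhere the padded interval is simply 4,000 bp longer than the original.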
Created:
2020-01-28