Supporting data for "LT1, an ONT long-read-based assembly scaffolded with Hi-C data and polished with short reads"
Source: DataCite Commons | Updated: 2025-05-26 | Indexed: 2024-07-13
Download link:
http://gigadb.org/dataset/100979
Description:
We present LT1, the first high-quality human reference genome from the Baltic States. LT1 is a female de novo human reference genome assembly constructed using 57x nanopore long reads and polished using 47x short paired-end reads. We used 72 Gb of Hi-C chromosomal mapping data for scaffolding to maximize the assembly's contiguity and accuracy. LT1's contig assembly was 2.73 Gbp in length and consisted of 4,490 contigs with an NG50 value of 12.0 Mbp. After scaffolding with Hi-C data and extensive manual curation, we produced an assembly with an NG50 value of 137 Mbp and 4,699 scaffolds. Our gene prediction quality assessment using BUSCO identified 89.3% of the single-copy orthologous genes included in the benchmark. Detailed characterization suggests that LT1 has 73,744 predicted transcripts, 4.2 million autosomal SNPs, 974,616 short indels, and 12,079 large structural variants. These data are shared as a public resource without any restrictions and can be used as a benchmark for further in-depth genomic analyses of the Baltic populations.
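The NG50 values quoted above are contiguity statistics: NG50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of an estimated genome size (N50 uses the assembly's own total length instead). The following is a minimal sketch of that calculation, not the authors' pipeline. The function name `ng50`, the toy contig lengths, and the genome sizes are hypothetical; the description does not state which genome size estimate was used for LT1.

```python
def ng50(lengths, genome_size):
    """Return the NG50 of a set of sequence lengths.

    NG50 is the length L such that sequences of length >= L sum to at
    least half of `genome_size`. Returns None when the assembly is too
    short to cover half of the genome.
    """
    target = genome_size / 2
    running_total = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running_total += length
        if running_total >= target:
            return length
    return None


# Hypothetical toy assembly (arbitrary units), not LT1 data.
toy_lengths = [40, 30, 20, 10]

# With genome_size equal to the assembly total, NG50 reduces to N50.
n50_value = ng50(toy_lengths, genome_size=sum(toy_lengths))

# With a larger assumed genome size, more sequences are needed.
ng50_value = ng50(toy_lengths, genome_size=160)

print(n50_value, ng50_value)
```

Because NG50 is normalized to a fixed genome size rather than to each assembly's own length, it allows contiguity to be compared across assemblies of different total size, such as the contig-level and scaffold-level figures reported here.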
Provider:
GigaScience Database
Created:
2022-04-12



