Supporting data for "KOREF_S1: the phased, parental Trio-binned Korean reference genome using long-reads and Hi-C sequencing methods"
Source: DataCite Commons. Updated: 2025-05-26. Indexed: 2025-04-15.
Download link:
http://gigadb.org/dataset/100983
Description:
KOREF is the Korean reference genome, constructed with several sequencing technologies, including long reads, short reads, and optical mapping. It is also the first East Asian multiomic reference genome accompanied by extensive clinical information, time-series and multiomic data, and the donor's parental sequencing data. However, it was not yet a chromosome-scale reference. Here, we updated the previous KOREF assembly to a new chromosome-level haploid assembly of KOREF, KOREF_S1v2.1. ONT PromethION, PacBio HiFi-CCS, and Hi-C technologies were used to build the most accurate East Asian reference assembled so far.

We produced 705 Gb of ONT reads and 114 Gb of PacBio HiFi reads, and corrected the ONT reads with the PacBio reads. The corrected ultra-long reads had a base-error rate of 1.4%, a higher accuracy than the previous KOREF_S1v1.0, which was built mainly from short reads. KOREF has parental genome information, and we successfully phased it using a trio-binning method, yielding a near-complete haploid assembly. The final assembly has a total length of 2.9 Gb with an N50 of 150 Mb, and its longest scaffold covers 97.3% of GRCh38's chromosome 2. The final assembly also shows high base accuracy, with a base-error rate below 0.01%.

KOREF_S1v2.1 is the first chromosome-scale haploid assembly of the Korean reference genome with high contiguity and accuracy. Our study provides useful resources for the Korean reference genome and demonstrates a new hybrid assembly strategy that combines ONT's PromethION and PacBio's HiFi-CCS.
Provider:
GigaScience Database
Created:
2022-02-07
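
Note on the N50 statistic:
The description reports contiguity as an N50 of 150 Mb. As a reminder of what that number means, the following is a minimal sketch of the standard N50 definition: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the dataset or from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches at least
    half of the summed length of all scaffolds.
    """
    if not lengths:
        raise ValueError("at least one scaffold length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Toy lengths (hypothetical, arbitrary units), not the KOREF scaffolds.
toy_lengths = [80, 70, 50, 40, 30, 20]
print(n50(toy_lengths))  # 70: 80 + 70 = 150 is at least half of 290
```

To apply this to a real assembly, the scaffold lengths would be read from the downloaded FASTA or its index file; that step is omitted here because the file layout of the dataset is not described above.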



