Diagnostic utility of the T2T reference genome
Source: NIAID Data Ecosystem (indexed 2026-05-10)
Download link:
https://www.ncbi.nlm.nih.gov/sra/ERP188130
Description:
Background: Structural variants (SVs) are increasingly recognised as key contributors to human diseases. However, our understanding of SVs in health and disease is limited, mainly because of their structural complexity and variable length between individuals, as well as limitations inherent to the available genomic technologies and the reference genome used. Results: To systematically evaluate SVs across human whole-genome samples using the hg38/GRCh38 and gapless T2T-CHM13 references, we introduced an innovative multiplatform approach, LongReadChecker (LoReC), which advances SV comparison and annotation based on distance variance, intersection, gene overlap and the closest SV in the clinical database. Comparison of SV detection performance on public and our own whole-genome datasets from short-read sequencing (SRS), available long-read sequencing (LRS) platforms and optical genome mapping (OGM) revealed that most SVs detected by SRS were confirmed by LRS, but LRS can identify twice as many SVs (25,000 SVs/genome) with greater read mapping accuracy. Our LoReC analysis further highlights the utility of the T2T-CHM13 reference in SV detection, as 20% more deletions and 20% fewer insertions were detected compared with hg38/GRCh38, which was particularly evident in long-read datasets. Because 80% of the SVs detected by LRS/SRS are smaller than 0.5 kbp, OGM did not detect them. Conclusions: Our study revealed that introducing distance variance, intersection, gene overlap and the closest SV in the clinical database may help compare and annotate SVs in diagnostics. Our data showed that LRS together with T2T-CHM13 gapless sequences can improve the diagnostics of patients with human diseases when SRS fails to identify the cause.
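The description names breakpoint distance, intersection and the closest SV as comparison criteria but does not say how LoReC computes them. The following Python sketch illustrates one common way to compare two SV call sets along those lines; it is not the LoReC implementation. The `SV` class, the function names, and the thresholds (`max_dist=500`, `min_overlap=0.5`) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class SV:
    """Hypothetical minimal SV record (not a LoReC data structure)."""
    chrom: str
    start: int   # 0-based start on the reference
    end: int     # exclusive end on the reference
    svtype: str  # e.g. "DEL", "INS"


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Intersection length divided by the longer SV, i.e. the smaller of the
    two overlap fractions. Returns 0.0 if the SVs do not intersect."""
    if a.chrom != b.chrom:
        return 0.0
    inter = min(a.end, b.end) - max(a.start, b.start)
    if inter <= 0:
        return 0.0
    return inter / max(a.end - a.start, b.end - b.start)


def breakpoint_distance(a: SV, b: SV) -> int:
    """Largest shift between corresponding breakpoints, in base pairs."""
    return max(abs(a.start - b.start), abs(a.end - b.end))


def closest_match(query: SV, candidates: Iterable[SV],
                  max_dist: int = 500,          # assumed threshold
                  min_overlap: float = 0.5      # assumed threshold
                  ) -> Optional[SV]:
    """Return the candidate of the same type and chromosome with the smallest
    breakpoint distance, or None if no candidate passes the thresholds."""
    best: Optional[SV] = None
    for cand in candidates:
        if cand.chrom != query.chrom or cand.svtype != query.svtype:
            continue
        dist = breakpoint_distance(query, cand)
        if dist > max_dist:
            continue
        # Insertions span almost no reference sequence, so the overlap
        # test is applied only to the other SV types.
        if query.svtype != "INS" and reciprocal_overlap(query, cand) < min_overlap:
            continue
        if best is None or dist < breakpoint_distance(query, best):
            best = cand
    return best
```

The same `closest_match` call could be pointed at a list of clinically annotated SVs instead of a second call set, which is one plausible reading of "the closest SV in the clinical database". Note that calls made against hg38/GRCh38 and T2T-CHM13 use different coordinate systems, so they would need to be lifted over to a common reference before a comparison like this is meaningful.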
Created:
2026-02-11



