Exome-wide benchmark of difficult-to-sequence regions using short-read next-generation DNA sequencing
Source: NIAID Data Ecosystem (indexed 2026-05-01)
Download link:
https://www.ncbi.nlm.nih.gov/sra/DRP010834
Description:
Short-read next-generation DNA sequencing (NGS) has recently come into use for genetic testing in various clinical settings. Data accuracy is crucial in these settings, and several reports on NGS quality control have been published, most of them focused on establishing sequence read accuracy. Variant calling is another critical source of NGS errors, yet it remains mostly unexplored despite its established significance. In this study, we used a machine-learning-based method to establish an exome-wide benchmark of difficult-to-sequence regions, using 10 genome sequence features and real-world NGS data accumulated in the Genome Aggregation Database (gnomAD) for the human reference genome sequence (GRCh38/hg38). We used the resulting metric, designated the 'UNMET score', together with other structural information about the human genome to identify genomic regions that are difficult to sequence with conventional NGS. The UNMET score could thus provide appropriate caveats about potential sequencing errors in protein-coding exons of GRCh38/hg38 in clinical sequencing.
Created:
2023-12-07



