
Analysis options for Next Generation Whole Genome Sequencing

Source: NIAID Data Ecosystem (indexed 2026-03-10)
Download link:
https://www.ncbi.nlm.nih.gov/bioproject/PRJNA436453
Description:
According to our literature search, WGS is currently applied in medical genetics and in the study of heritable monogenic disorders, both of which interrogate germline variants. The studies reported in the previous paragraph usually call variants with the gold-standard GATK Best Practices pipeline (https://software.broadinstitute.org/gatk/best-practices/), supplemented in cancer studies by somatic variant callers [do Valle and others 2016]. From a technical point of view, the critical step in such a pipeline is variant calling, which must be precise and appropriate both to the WGS coverage and to the type of experiment. Our literature search indicates that the most accurate variant calls for 30x human WGS were recently reported in the PrecisionFDA Truth Challenge (https://precision.fda.gov/challenges/truth/results). F-score values (the harmonic mean of recall and precision) reached 99.9587% for single nucleotide variants (SNVs) and 99.4009% for short indels. The DeepVariant tool [Poplin et al., bioRxiv], which won the challenge, is the first variant calling method to use TensorFlow machine learning. We therefore ask whether introducing machine learning into medical genomics could significantly improve the precision of variant analysis. To test this hypothesis, we compared DeepVariant with currently used methods on an independent NA12878 sample sequenced in our laboratory. Furthermore, we used the newest GRCh38.p10 reference genome [Speir and others 2016], whereas Poplin et al. called variants against GRCh37. Results generated by DeepVariant were compared with those of GATK 4.0 (the gold-standard pipeline) and SpeedSeq (an efficient pipeline).
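The F-score quoted in the description is the harmonic mean of precision and recall. The following is a minimal sketch of that calculation, assuming variant calls have already been compared against a truth set and counted as true positives, false positives and false negatives. The function names and the counts in the example are hypothetical and are not taken from the dataset or from the challenge results.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from counts of true-positive,
    false-positive and false-negative variant calls."""
    if min(tp, fp, fn) < 0:
        raise ValueError("counts must be non-negative")
    # Precision: fraction of called variants that are in the truth set.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: fraction of truth-set variants that were called.
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    p, r = precision_recall(tp=9990, fp=10, fn=30)
    print(f"precision={p:.4%} recall={r:.4%} F-score={f_score(p, r):.4%}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a caller cannot reach a high F-score by trading many false positives for a few extra true positives, or the reverse.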
Created:
2018-03-01