HapMap Exome
NIAID Data Ecosystem, indexed 2026-03-07
Download link:
https://www.ncbi.nlm.nih.gov/sra/SRP007298
Resource description:
Large amounts of exome data are currently being generated from controls and disease individuals. While several analytical methods have been developed to routinely detect point mutations and small INDELs up to 10 bp, larger events are more difficult to discover because of their computational expense. We developed an algorithm to detect smaller INDELs and structural variants ranging in length from 1 bp up to 1 Mbp within exome sequence datasets using a split-read approach. The method requires paired-end mapping data and searches specifically for clusters of one-end anchored placements using mrsFAST alignments. Our approach decomposes the unanchored end into smaller subsequences and maps them locally to identify the size, content, and location of the structural variant within exons. Using this algorithm, we construct a computational pipeline and apply it to exome datasets we generated from samples sequenced as part of the 1000 Genomes Project. Comparing to genome sequence data, we show good specificity (70%) and high sensitivity (87%) including the discovery and validation of a significant number of novel indels and structural variants within protein-coding sequence. Using parent-child trio exome sequencing datasets, we show how split-read and read-depth approaches can be combined to enrich for de novo mutational events. Our method can detect both indels and structural variants irrespective of the size of an event and can recover events within low complexity and repetitive regions missed by other methods.
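The split-read idea described above can be illustrated with a toy example. This is a minimal sketch only, not the authors' pipeline: it assumes exact string matching in place of mrsFAST alignments, handles only simple deletions, and takes the first placement it finds. The function name `find_deletion` and the parameters `ref_window` and `min_seed` are hypothetical names introduced for illustration.

```python
def find_deletion(ref_window, read, min_seed=4):
    """Toy split-read search for a single deletion.

    ref_window: reference sequence near the anchored mate (assumed input).
    read:       the unanchored read that failed to map end to end.
    min_seed:   shortest prefix/suffix piece we are willing to place.

    Returns (deletion_start, deletion_length) in ref_window coordinates,
    or None if the read maps contiguously or no split explains it.
    """
    # A read that matches the reference as-is carries no split signal.
    if read in ref_window:
        return None

    # Try every split point that leaves at least min_seed bases per side.
    for k in range(min_seed, len(read) - min_seed + 1):
        prefix, suffix = read[:k], read[k:]

        # Place the left piece (first exact occurrence only).
        p = ref_window.find(prefix)
        if p < 0:
            continue

        # Place the right piece at or after the end of the left piece.
        q = ref_window.find(suffix, p + k)

        # A gap between the two placements is the deleted reference span.
        if q > p + k:
            return (p + k, q - (p + k))

    return None


# Usage: build a reference with a known 9 bp segment and a read that skips it.
left, deleted, right = "ACGTTGCAAGGCTA", "CCCGGGTTT", "GATCCATGACTGAA"
reference = left + deleted + right
spanning_read = left[-8:] + right[:8]
print(find_deletion(reference, spanning_read))
```

A real implementation would also need to tolerate mismatches, resolve ambiguous breakpoints in repetitive sequence, and detect insertions and larger rearrangements, none of which this sketch attempts.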
Created:
2013-08-23



