
HapMap Exome

Source: NIAID Data Ecosystem (indexed 2026-03-07)
Download link:
https://www.ncbi.nlm.nih.gov/sra/SRP007298
Description:
Large amounts of exome data are currently being generated from control and disease individuals. While several analytical methods have been developed to routinely detect point mutations and small indels up to 10 bp, larger events are more difficult to discover because of their computational expense. We developed an algorithm to detect small indels and structural variants ranging in length from 1 bp up to 1 Mbp within exome sequence datasets using a split-read approach. The method requires paired-end mapping data and searches specifically for clusters of one-end anchored placements using mrsFAST alignments. Our approach decomposes the unanchored end into smaller subsequences and maps them locally to identify the size, content, and location of the structural variant within exons. Using this algorithm, we constructed a computational pipeline and applied it to exome datasets we generated from samples sequenced as part of the 1000 Genomes Project. Compared with genome sequence data, we show good specificity (70%) and high sensitivity (87%), including the discovery and validation of a significant number of novel indels and structural variants within protein-coding sequence. Using parent-child trio exome sequencing datasets, we show how split-read and read-depth approaches can be combined to enrich for de novo mutational events. Our method can detect both indels and structural variants irrespective of the size of an event and can recover events within low-complexity and repetitive regions missed by other methods.
Created:
2013-08-23
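
The split-read idea in the description (decompose the unanchored end into smaller subsequences and map them locally) can be illustrated with a toy sketch. This is not the authors' pipeline and does not use mrsFAST: the function name `find_deletion`, the `min_anchor` parameter, and the example sequences are all hypothetical, and exact string matching stands in for a real local aligner. It handles only a simple deletion and reports the first consistent split it finds.

```python
from typing import Optional, Tuple


def find_deletion(
    reference: str, read: str, min_anchor: int = 5
) -> Optional[Tuple[int, int]]:
    """Toy split-read search for a single deletion.

    Splits the read into a left and right piece at every position that
    leaves at least `min_anchor` bases on each side, and looks for both
    pieces in the reference with exact matching. If the right piece maps
    downstream of the left piece with a gap, that gap is reported as
    (deletion_start, deletion_length) in reference coordinates.
    Returns None when no split implies a deletion.
    """
    for split in range(min_anchor, len(read) - min_anchor + 1):
        left, right = read[:split], read[split:]
        left_pos = reference.find(left)
        if left_pos < 0:
            continue
        left_end = left_pos + len(left)
        # Search for the right piece at or after the end of the left piece.
        right_pos = reference.find(right, left_end)
        if right_pos > left_end:
            return left_end, right_pos - left_end
    return None


# Hypothetical local reference window and a read that skips 6 bases of it.
reference = "ACGTACCTGAGGATCCTTAGCAATGCCGTA"
read_with_deletion = reference[2:12] + reference[18:28]
read_contiguous = reference[2:22]

print(find_deletion(reference, read_with_deletion))  # (12, 6)
print(find_deletion(reference, read_contiguous))     # None
```

A real implementation would also need to tolerate mismatches, resolve ambiguous breakpoints in repeats, and handle insertions and other event types, none of which this sketch attempts.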