
wuHMM: a robust algorithm to detect DNA copy number variation using long oligonucleotide microarray data

Source: NIAID Data Ecosystem (indexed 2026-03-07)
Download link:
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE10511
Description:
Copy number variants (CNVs) are currently defined as genomic sequences that are polymorphic in copy number and range in length from 1,000 to several million base pairs. Among current array-based CNV detection platforms, long-oligonucleotide arrays promise the highest resolution. However, the performance of currently available analytical tools suffers when applied to these data because of the lower signal-to-noise ratio inherent in oligonucleotide-based hybridization assays. We have developed wuHMM, an algorithm for mapping CNVs from array comparative genomic hybridization (aCGH) platforms comprising 385,000 to more than 3 million probes. wuHMM is unique in that it can utilize sequence divergence information to reduce the false positive rate (FPR). We apply wuHMM to 385K-aCGH, 2.1M-aCGH, and 3.1M-aCGH experiments comparing the 129X1/SvJ and C57BL/6J inbred mouse genomes. We assess wuHMM’s performance on the 385K platform by comparison to the higher resolution platforms, and we independently validate 10 CNVs. The method requires no training data and is robust with respect to changes in algorithm parameters. At an FPR of less than 10%, the algorithm can detect CNVs with five probes on the 385K platform and three on the 2.1M and 3.1M platforms, resulting in effective resolutions of 24 kb, 2-5 kb, and 1 kb, respectively.

Keywords: CNV detection algorithm development and assessment

All four samples in this series are hybridizations of genomic DNA from inbred mouse strains 129X1/SvJ versus C57BL/6J. The experiments were performed at increasing resolutions (one 385K, two 2.1M, and one 3.1M).
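For readers unfamiliar with HMM-based CNV calling, the general idea can be illustrated with a minimal sketch. This is not the wuHMM implementation: it is a generic three-state Gaussian HMM decoded with the Viterbi algorithm, and every name and number in it (the state means, `SIGMA`, `STAY_PROB`, and the helper `call_segments`) is an assumption chosen for illustration. The only idea borrowed from the description above is that a call must span a minimum number of probes (five on the 385K platform, three on the 2.1M and 3.1M platforms).

```python
import math

# All values below are illustrative assumptions, not wuHMM parameters.
STATES = ("loss", "normal", "gain")
MEANS = {"loss": -0.5, "normal": 0.0, "gain": 0.5}  # assumed mean log2 ratios
SIGMA = 0.25       # assumed probe-level noise (standard deviation)
STAY_PROB = 0.98   # assumed probability of staying in the same state


def log_gauss(x, mean, sigma):
    """Log density of a normal distribution at x."""
    return (-0.5 * math.log(2 * math.pi * sigma ** 2)
            - (x - mean) ** 2 / (2 * sigma ** 2))


def viterbi(log2_ratios):
    """Return the most likely state for each probe, in genomic order."""
    if not log2_ratios:
        return []
    stay = math.log(STAY_PROB)
    switch = math.log((1.0 - STAY_PROB) / (len(STATES) - 1))
    # Uniform prior over states for the first probe.
    score = {s: -math.log(len(STATES)) + log_gauss(log2_ratios[0], MEANS[s], SIGMA)
             for s in STATES}
    back = []  # back[i][s] = best predecessor of state s at probe i + 1
    for x in log2_ratios[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev = max(
                STATES,
                key=lambda p: score[p] + (stay if p == s else switch),
            )
            trans = stay if best_prev == s else switch
            pointers[s] = best_prev
            new_score[s] = score[best_prev] + trans + log_gauss(x, MEANS[s], SIGMA)
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def call_segments(path, min_probes):
    """Collapse a state path into (start, end, state) calls.

    Only non-normal runs of at least min_probes probes are kept;
    end is exclusive.
    """
    segments = []
    start = 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            if path[start] != "normal" and i - start >= min_probes:
                segments.append((start, i, path[start]))
            start = i
    return segments


if __name__ == "__main__":
    ratios = [0.0] * 10 + [0.5] * 6 + [0.0] * 10
    print(call_segments(viterbi(ratios), min_probes=5))
```

In this sketch the minimum-probe filter is applied after decoding. How wuHMM itself enforces probe counts, or how it incorporates sequence divergence information, is not described in this record and is not modeled here.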

Created: 2012-03-19