
Data from: DiscoMark: Nuclear marker discovery from orthologous sequences using draft genome data

Source: DataONE. Updated: 2016-07-19. Indexed: 2024-06-26.
Download link:
https://search.dataone.org/view/null
Description:
High-throughput sequencing has laid the foundation for fast and cost-effective development of phylogenetic markers. Here we present the program DISCOMARK, which streamlines the development of nuclear DNA (nDNA) markers from whole-genome (or whole-transcriptome) sequencing data. It combines local alignment, alignment trimming, reference mapping and primer design based on multiple sequence alignments to design primer pairs from input orthologous sequences. To demonstrate the suitability of DISCOMARK, we designed markers for two groups of species, one consisting of closely related species and one of distantly related species. For the closely related members of the species complex of Cloeon dipterum s.l. (Insecta, Ephemeroptera), the program discovered a total of 78 markers. Among these, we selected eight markers for amplification and Sanger sequencing. The exon sequence alignments (2,526 base pairs (bp)) were used to reconstruct a well-supported phylogeny and to infer clearly structured haplotype networks. For the distantly related species, we designed primers for several families in the insect order Ephemeroptera, using available genomic data from four sequenced species. We developed primer pairs for 23 markers that are designed to amplify across several families. The DISCOMARK program will enhance the development of new nDNA markers by providing a streamlined, automated approach to perform genome-scale scans for phylogenetic markers. The program is written in Python and released under a public license (GNU GPL v2). It is available, together with a manual and an example data set, at: https://github.com/hdetering/discomark.
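As a rough illustration of what "primer design based on multiple sequence alignments" can involve, the sketch below scans a toy alignment for fully conserved windows and pairs them up as candidate forward and reverse primer sites. This is not DISCOMARK's implementation: the function names, the toy sequences, and the `window` and `min_gap` parameters are all hypothetical, and real primer design also weighs melting temperature, GC content and degeneracy.

```python
from typing import List, Tuple


def column_is_conserved(column: str) -> bool:
    """True if every sequence has the same unambiguous base in this column."""
    return len(set(column)) == 1 and column[0] in "ACGT"


def conserved_windows(alignment: List[str], window: int) -> List[int]:
    """Return start positions of windows in which every column is conserved."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    conserved = [
        column_is_conserved("".join(seq[i] for seq in alignment))
        for i in range(length)
    ]
    return [
        start
        for start in range(length - window + 1)
        if all(conserved[start:start + window])
    ]


def candidate_primer_sites(
    alignment: List[str], window: int, min_gap: int
) -> List[Tuple[int, int]]:
    """Pair conserved windows separated by at least `min_gap` columns."""
    starts = conserved_windows(alignment, window)
    return [
        (fwd, rev)
        for fwd in starts
        for rev in starts
        if rev - (fwd + window) >= min_gap
    ]


# Toy alignment (hypothetical data): conserved flanks around a variable core.
alignment = [
    "ACGTACGTTTGACCGGTACA",
    "ACGTACGTATGTCCGGTACA",
    "ACGTACGTTAGACCGGTACA",
]
pairs = candidate_primer_sites(alignment, window=6, min_gap=4)
print(pairs)
```

Each returned tuple gives the start columns of a forward and a reverse candidate site. The variable columns between them are what would carry phylogenetic signal in an amplified marker.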

Created:
2016-07-19