TreeOTU

Figshare; updated 2016-01-18; indexed 2026-04-08
Download link:
https://figshare.com/articles/dataset/TreeOTU/783077/4
Description:
Our current understanding of the taxonomic and phylogenetic diversity of cellular organisms, especially the bacteria and archaea, is mostly based upon studies of sequences of the small-subunit rRNAs (ssu-rRNAs). To address the limitations of ssu-rRNA as a phylogenetic marker, such as copy number variation among organisms and complications introduced by horizontal gene transfer, convergent evolution, or variations in evolutionary rate, we have identified protein-coding gene families as alternative Phylogenetic and Phylogenetic Ecology (PhyEco) markers. Current Operational Taxonomic Unit (OTU) classification methods based on nucleotide sequence similarity are not readily applicable to the amino acid sequences of PhyEco markers. We report here the development of TreeOTU, an OTU classification method based on phylogenetic tree structure that takes into account differences in rates of evolution between taxa and between genes. OTU sets built by TreeOTU are more faithful to phylogenetic tree structures than those built by sequence clustering (non-phylogenetic) methods for ssu-rRNAs. OTUs built from phylogenetic trees of protein-coding PhyEco markers are comparable to our current taxonomic classification at different levels. With the included OTU comparison tools, TreeOTU is robust in phylogenetic referencing with different phylogenetic markers and trees. The TreeOTU package includes OTU classification, comparison and tree rooting scripts, as well as the alignments, trees and NCBI/IMG taxonomic classification information related to this research. The contents of the package are described in the file README.txt.
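
The description does not spell out how TreeOTU derives OTUs from a tree, so the following is only a minimal sketch of the general idea of grouping sequences by tree structure rather than by pairwise sequence identity. It splits a rooted tree into maximal clades whose deepest leaf lies within a chosen branch-length threshold of the clade root. The `Node` class, the function names and the threshold values are hypothetical and are not taken from the TreeOTU package; the sketch also makes no attempt to correct for rate differences between taxa or genes, which the actual method is stated to handle.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node; `length` is the branch length to its parent."""
    name: Optional[str] = None
    length: float = 0.0
    children: List["Node"] = field(default_factory=list)


def leaves(node: Node) -> List[str]:
    """Collect the names of all leaves beneath (or at) this node."""
    if not node.children:
        return [node.name]
    names: List[str] = []
    for child in node.children:
        names.extend(leaves(child))
    return names


def depth(node: Node) -> float:
    """Longest branch-length path from this node down to any of its leaves."""
    if not node.children:
        return 0.0
    return max(child.length + depth(child) for child in node.children)


def tree_otus(root: Node, threshold: float) -> List[List[str]]:
    """Split the leaves into maximal clades whose depth is <= threshold.

    This is an illustrative cut rule, not the TreeOTU algorithm.
    """
    if depth(root) <= threshold:
        return [sorted(leaves(root))]
    otus: List[List[str]] = []
    for child in root.children:
        otus.extend(tree_otus(child, threshold))
    return otus


# Toy tree: ((A:0.1,B:0.1):0.5,(C:0.2,(D:0.05,E:0.05):0.1):0.4)
toy = Node(children=[
    Node(length=0.5, children=[Node("A", 0.1), Node("B", 0.1)]),
    Node(length=0.4, children=[
        Node("C", 0.2),
        Node(length=0.1, children=[Node("D", 0.05), Node("E", 0.05)]),
    ]),
])

for cutoff in (0.12, 0.25, 1.0):
    print(cutoff, tree_otus(toy, cutoff))
```

Raising the threshold merges clades into fewer, broader OTUs, loosely analogous to moving from fine to coarse taxonomic levels.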

Provided by:
Dongying Wu
Created:
2014-03-12