
OrthoMCL

Source: re3data.org (indexed 2024-05-31)
Download link:
https://www.re3data.org/repository/r3d100012462
Resource description:
OrthoMCL is a genome-scale algorithm for grouping orthologous protein sequences. It produces not only groups shared by two or more species/genomes, but also groups representing species-specific gene expansion families, which makes it a useful utility for automated eukaryotic genome annotation. OrthoMCL starts with reciprocal best hits within each genome as potential in-paralog (recent paralog) pairs and reciprocal best hits across any two genomes as potential ortholog pairs. Related proteins are then interlinked in a similarity graph, and MCL (Markov Clustering algorithm; Van Dongen 2000; www.micans.org/mcl) is invoked to split mega-clusters, a step analogous to the manual review in COG construction. Because MCL clusters on the weights between each pair of proteins, the weights are normalized before running MCL to correct for differences in evolutionary distance.
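
The steps described above can be illustrated with a small Python sketch. It is not OrthoMCL's actual code: the protein IDs (written as `genome|protein`), the similarity scores, and the helper names `genome_of`, `reciprocal_best_hits`, `normalize` and `mcl` are all hypothetical. The normalization rule (dividing each edge weight by the mean weight of the edges linking the same pair of genomes) is an assumption, since the description only says that weights are normalized. The `mcl` function is a minimal re-implementation of the expansion/inflation idea behind Markov clustering, not the `mcl` program itself.

```python
from collections import defaultdict

def genome_of(protein_id):
    """Protein IDs are assumed to look like 'genome|protein'."""
    return protein_id.split("|")[0]

def reciprocal_best_hits(scores):
    """Keep pairs that are each other's best hit for their genome pair."""
    best = {}  # (query, target genome) -> (best hit, score)
    for (a, b), s in scores.items():
        for query, hit in ((a, b), (b, a)):
            key = (query, genome_of(hit))
            if key not in best or s > best[key][1]:
                best[key] = (hit, s)
    edges = {}
    for (query, _), (hit, s) in best.items():
        back = best.get((hit, genome_of(query)))
        if back is not None and back[0] == query:
            edges[tuple(sorted((query, hit)))] = s
    return edges

def normalize(edges):
    """Assumed rule: divide each weight by the mean weight for its genome pair."""
    by_pair = defaultdict(list)
    for (a, b), w in edges.items():
        by_pair[frozenset((genome_of(a), genome_of(b)))].append(w)
    mean = {pair: sum(ws) / len(ws) for pair, ws in by_pair.items()}
    return {(a, b): w / mean[frozenset((genome_of(a), genome_of(b)))]
            for (a, b), w in edges.items()}

def mcl(edges, inflation=2.0, iterations=30):
    """Minimal Markov clustering: alternate expansion and inflation."""
    nodes = sorted({p for pair in edges for p in pair})
    index = {p: i for i, p in enumerate(nodes)}
    n = len(nodes)
    # Start from the weighted adjacency matrix with self-loops of weight 1.
    m = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for (a, b), w in edges.items():
        m[index[a]][index[b]] = m[index[b]][index[a]] = w

    def column_normalize(mat):
        sums = [sum(mat[i][j] for i in range(n)) for j in range(n)]
        return [[mat[i][j] / sums[j] for j in range(n)] for i in range(n)]

    for _ in range(iterations):
        m = column_normalize(m)
        # Expansion: square the column-stochastic matrix.
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: raise entries to a power to strengthen strong flows.
        m = [[x ** inflation for x in row] for row in m]
    m = column_normalize(m)
    # Each non-empty row is an attractor; its non-zero columns form a cluster.
    return {frozenset(nodes[j] for j in range(n) if m[i][j] > 1e-6)
            for i in range(n) if any(x > 1e-6 for x in m[i])}

# Toy all-vs-all similarity scores (illustrative numbers only).
scores = {
    ("A|p1", "B|p1"): 100.0,
    ("A|p2", "B|p2"): 50.0,
    ("A|p1", "B|p2"): 10.0,
    ("A|p2", "B|p1"): 8.0,
}
edges = reciprocal_best_hits(scores)   # drops the two weak, non-reciprocal hits
weights = normalize(edges)
clusters = mcl(weights)
for group in sorted(sorted(c) for c in clusters):
    print(group)
```

On this toy input the two weak cross hits are discarded at the reciprocal-best-hit stage, so the sketch reports `A|p1`/`B|p1` and `A|p2`/`B|p2` as separate groups. In this sketch, raising the `inflation` parameter makes the clustering split groups more aggressively, and lowering it keeps larger groups together.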

Provider:
Ortholog Groups of Protein Sequences