GMPipe: A gene mining pipeline for identifying olfactory receptor repertoires and other multicopy gene families from genomes
Source: NIAID Data Ecosystem (indexed 2026-05-02)
Download link:
https://zenodo.org/record/10614214
Description:
This repository contains the code and tutorial for the GMPipe algorithm. GMPipe is a gene mining pipeline designed to maximize the identification of multicopy gene family homologs in genome assemblies (for intronless genes) or in lists of protein sequences. GMPipe has been validated for mining olfactory receptor genes from vertebrate genomes. The algorithm uses custom ingroup and outgroup sequence lists to distinguish members of the gene family of interest from close off-target relatives. GMPipe uses HMMER as an initial screening step, then applies phylogenetic clustering to test whether the sequences that pass are more closely related to "ingroup" than to "outgroup" sequences. The result is a pipeline with a low false-negative rate and minimal impact on the false-positive rate, especially when paired with additional family-specific filtering steps.
Created:
2024-08-20
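
The two-stage logic in the description (an initial screen, then an ingroup-versus-outgroup test) can be illustrated with a minimal Python sketch. This is not GMPipe's code: the `Hit` class, the `classify` function, and the `1e-5` E-value cutoff are hypothetical, and a nearest-neighbor comparison on pairwise p-distance stands in for the phylogenetic clustering that the real pipeline performs. It also assumes the candidate and reference sequences are already aligned to the same columns.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """A candidate sequence (hypothetical structure, not GMPipe's own)."""
    name: str
    evalue: float      # E-value reported by an initial profile-HMM screen
    aligned_seq: str   # candidate aligned to the same columns as the references


def p_distance(a: str, b: str) -> float:
    """Fraction of differing residues, skipping columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0  # no comparable columns: treat as maximally distant
    return sum(x != y for x, y in pairs) / len(pairs)


def classify(hit: Hit, ingroup: list[str], outgroup: list[str],
             max_evalue: float = 1e-5) -> str:
    """Return 'rejected', 'ingroup', or 'outgroup' for one candidate.

    Assumption: closeness to the nearest reference sequence is used as a
    simplified proxy for placement in a phylogenetic tree.
    """
    if hit.evalue > max_evalue:
        return "rejected"  # failed the initial screen
    d_in = min(p_distance(hit.aligned_seq, s) for s in ingroup)
    d_out = min(p_distance(hit.aligned_seq, s) for s in outgroup)
    return "ingroup" if d_in < d_out else "outgroup"


# Toy aligned reference sequences (invented for illustration only).
INGROUP = ["MKTLLA", "MKTLIA"]
OUTGROUP = ["MRSVVG"]

candidates = [
    Hit("cand1", 1e-20, "MKTLLG"),
    Hit("cand2", 1e-10, "MRSVIG"),
    Hit("cand3", 0.5, "MKTLLA"),
]
results = {h.name: classify(h, INGROUP, OUTGROUP) for h in candidates}
```

In this toy run, a candidate is kept only if it passes the screen and sits closer to an ingroup reference than to any outgroup reference. A tree-based test, as the description indicates GMPipe uses, would replace the distance comparison in `classify`.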



