ALIGNED_BEST_FIRST_1500_ORTHOLOGS.tar
DataONE | Updated 2015-05-21 | Indexed 2024-06-27
Download link:
https://search.dataone.org/view/null
Description:
ALIGNED_BEST_FIRST_15000_ORTHOLOGS folder: “Best” ortholog amino acid and nucleotide alignments. The 104,235 putative orthologs described in ALIGNED_ALL-ORTHOLOGS often contained more than two representative sequences per species. For the first 15,000 putative orthologs (those with the most species included in the alignments), we used UCLUST to find the best representative per species per ortholog by taking the sequence that was closest to the centroid for that ortholog. Alignments containing one representative per species (found by the centroid clustering explained in the methods) are given for Orthologs 1-15,000, regardless of how many species are contained in the centroid. After designating one representative sequence per species, alignments were performed as described in the methods (e.g. MSAProbs followed by TranslatorX). Amino acid and nucleotide alignments are given in their respective folders.
- Also included in each folder is an ortholog key file (Ortholog_Key_3_20_2014.xlsx) that lists each putative ortholog clustered by OrthoMCL, the best BLAST hit to UniProt, the number of species included in the ortholog, and the centroid IDs of those species (corresponding to the transcriptome assembly .fasta file). OrthoMCL was run on 74 total species, and we excluded from our alignments 8 species with very poor representation (for a maximum of 66 species contained within the alignments analyzed for the paper).
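
The per-species selection described above (keep the one sequence closest to the ortholog's centroid) can be illustrated with a minimal Python sketch. This is not the authors' UCLUST pipeline: the function name `pick_best_per_species`, the input tuple layout, and the example identifiers and identity values are all hypothetical, and it assumes a similarity score to the centroid has already been computed for each sequence.

```python
def pick_best_per_species(hits):
    """Keep, for each species, the sequence most similar to the centroid.

    hits: iterable of (species, seq_id, identity_to_centroid) tuples.
    This layout is an assumption for illustration, not a UCLUST output format.
    """
    best = {}
    for species, seq_id, identity in hits:
        # Replace the stored sequence only if this one is strictly closer
        # to the centroid; on ties the first sequence seen is kept.
        if species not in best or identity > best[species][1]:
            best[species] = (seq_id, identity)
    return {species: seq_id for species, (seq_id, _) in best.items()}


# Hypothetical example: two candidate sequences for species_A, one for species_B.
hits = [
    ("species_A", "A_contig1", 91.2),
    ("species_A", "A_contig2", 97.5),
    ("species_B", "B_contig7", 88.0),
]
print(pick_best_per_species(hits))
```

Each ortholog would be processed independently in this way, so every resulting alignment holds at most one sequence per species.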
Created:
2015-05-21



