ALIGNED_BEST_FIRST_1500_ORTHOLOGS.tar

DataONE | Updated 2015-05-21 | Indexed 2024-06-27
Download link:
https://search.dataone.org/view/null
Description:
ALIGNED_BEST_FIRST_15000_ORTHOLOGS folder: “Best” ortholog amino acid and nucleotide alignments. The 104,235 putative orthologs described in ALIGNED_ALL-ORTHOLOGS often contained more than two representative sequences per species. For the first 15,000 putative orthologs (those with the most species included in the alignments), we used UCLUST to find the best representative per species per ortholog by taking the sequence that was closest to the centroid for that ortholog. Alignments containing one representative per species (found by the centroid clustering explained in the methods) are given for Orthologs 1-15,000, regardless of how many species are contained in the centroid. After designating one representative sequence per species, alignments were performed as described in the methods (e.g. MSAProbs followed by TranslatorX). Amino acid and nucleotide alignments are given in their respective folders.

Also included in each folder is an ortholog key file (Ortholog_Key_3_20_2014.xlsx) that contains a list of each putative ortholog clustered by OrthoMCL, the best BLAST hit to UniProt, the number of species that were included in the ortholog, and the centroid IDs of those species (corresponding to the transcriptome assembly .fasta file). OrthoMCL was run on 74 total species, and we excluded from our alignments 8 species with very poor representation (for a maximum of 66 species contained within the alignments analyzed for the paper).
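The per-species selection described above (keep the sequence closest to the ortholog centroid) can be illustrated with a minimal Python sketch. This is not the UCLUST procedure the authors ran. It assumes each sequence already has a percent identity to its cluster centroid, and the `Hit` record, the `best_per_species` helper, and all identifiers and values are hypothetical.

```python
from typing import Dict, Iterable, NamedTuple


class Hit(NamedTuple):
    seq_id: str      # hypothetical sequence identifier
    species: str     # species the sequence came from
    identity: float  # assumed percent identity to the ortholog centroid


def best_per_species(hits: Iterable[Hit]) -> Dict[str, Hit]:
    """Keep, for each species, the hit with the highest identity to the centroid."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.species)
        # Strict '>' keeps the first-seen hit when two hits tie.
        if current is None or hit.identity > current.identity:
            best[hit.species] = hit
    return best


# Made-up values for a single ortholog cluster.
hits = [
    Hit("spA_t1", "species_A", 91.2),
    Hit("spA_t2", "species_A", 97.5),
    Hit("spB_t1", "species_B", 88.0),
]
representatives = best_per_species(hits)
for species, hit in sorted(representatives.items()):
    print(species, hit.seq_id, hit.identity)
```

Each species ends up with exactly one representative, which matches the one-sequence-per-species layout of the alignments in this folder. How ties and centroid distances were actually handled is determined by the clustering tool and is not specified here.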
Created:
2015-05-21