
Curated sequence database "Life-OF-Mick" first used in Cornet, Magain et al. Exploring syntenic conservation across genomes for phylogenetic studies of organisms subjected to horizontal gene transfers: a case study with Cyanobacteria.

Source: DataCite Commons. Updated: 2021-01-10. Indexed: 2024-07-28.
Download link:
https://figshare.com/articles/dataset/Curated_sequence_database_Life-OF-Mick_first_used_in_Cornet_Magain_et_al_Exploring_syntenic_conservation_across_genomes_for_phylogenetic_studies_of_organisms_subjected_to_horizontal_gene_transfers_a_case_study_with_Cyanobacteria_/13550267/1
Resource description:
This archive is a database of 272 complete proteomes broadly sampled from the three domains of Life. Used in various projects of the Unit of Eukaryotic Phylogenomics (ULiège, Belgium), it is internally referred to as "Life-OF-Mick".

The main file, lifeofmick.fasta.gz, is a FASTA file containing 2,176,699 protein sequences. These correspond to the full conceptual translations of:
- 50 archaeal genomes
- 115 bacterial genomes
- 107 eukaryotic genomes

The lineages of the 272 organisms (following NCBI Taxonomy as of Sep 2016) are given in the file lifeofmick.tsv.

Archaeal and bacterial proteomes were sampled from Ensembl Bacteria (release 30), whereas eukaryotic proteomes were hand-picked from various public genome portals (i.e., Ensembl Protists, EuPathDB, JGI, NCBI, pico-PLAZA) and institutional websites. Details of the eukaryotic sources (including download links) are available in the file lifeofmick-euka-src.tsv, while archaeal and bacterial download links are provided in the files lifeofmick-arch-src.tsv and lifeofmick-bact-src.tsv, respectively.

Eukaryotic proteomes were complemented with organellar sequence data (mitochondrial and plastid genomes) fetched from the NCBI databases. A table of these complementations can be found in the file lifeofmick-euka-org-compl.tsv.

Finally, eukaryotic proteomes were dereplicated using CD-HIT with a global identity threshold of 99%.

Note that the "complete proteomes" of Cryptomonas paramecium, Chroomonas mesostigmatica, Lotharella oceanica and Paulinella chromatophora actually correspond to the "plastid" genomes only.
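As a quick sanity check after downloading, the stated sequence count can be compared against the number of FASTA header lines in the main file. The following is a minimal sketch, assuming the archive has been saved locally as lifeofmick.fasta.gz in the working directory; the function name `count_fasta_records` is hypothetical, and no assumption is made about the format of the header lines beyond the leading `>`.

```python
import gzip
from pathlib import Path


def count_fasta_records(path):
    """Count FASTA records in a gzip-compressed file by counting '>' header lines."""
    count = 0
    # Open in text mode so each line can be tested directly; undecodable
    # bytes are replaced rather than raising, since only '>' matters here.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.startswith(">"):
                count += 1
    return count


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the file was downloaded.
    fasta_path = Path("lifeofmick.fasta.gz")
    expected = 2_176_699  # count stated in the dataset description
    if fasta_path.exists():
        observed = count_fasta_records(fasta_path)
        print(f"observed={observed} expected={expected} match={observed == expected}")
    else:
        print(f"{fasta_path} not found; download it first.")
```

The file is streamed line by line, so the whole archive is never loaded into memory. A mismatch with the stated count would suggest a truncated or otherwise incomplete download.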
Provider:
figshare
Created:
2021-01-10