List of genomes used a reference for the functional annotation of proteins in new genomes
Source: DataCite Commons. Updated: 2020-10-20. Indexed: 2024-07-28.
Download link:
https://figshare.com/articles/dataset/List_of_genomes_used_a_reference_for_the_functional_annotation_of_proteins_in_new_genomes/13118405
Description:
We used Prokka (version 1.12) (Seemann 2014) to annotate the new genome sequences, with the following notable options: “--force --addgenes --locustag --compliant --usegenus --genus Rhizobium --kingdom Bacteria --gcode 11”. The reference database of annotated proteins was built from the proteomes predicted from all available complete genomes of the Rhizobium/Agrobacterium group (n=206), downloaded from the NCBI RefSeq Assembly database on 15 December 2017 using the query ‘txid227290[Organism:exp] AND ("latest refseq"[filter] AND ("scaffold level"[filter] OR "chromosome level"[filter] OR "complete genome"[filter]) AND all[filter] NOT anomalous[filter])’. These proteomes were merged into a non-redundant dataset, which was then thinned by retaining a single representative protein for each protein cluster, as determined by the CD-HIT clustering program (version 4.6) (Fu et al. 2012) run with the options ‘-T 0 -M 0 -G 1 -s 0.8 -c 0.9’. The resulting dataset served as the subject database for the BLASTP (Camacho et al. 2009) similarity search during the annotation process.
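The two command lines described above can be sketched as follows. This is a minimal illustration, not the authors' actual script: it only assembles the argument lists, using the options quoted in the description. The file names, the locus tag value, and the function names are hypothetical, and the description does not say how the clustered protein set was handed to Prokka, so that step is left out.

```python
import shlex
from typing import List


def cdhit_command(merged_faa: str, clustered_faa: str) -> List[str]:
    """Build the CD-HIT call that thins the merged proteomes.

    The -T/-M/-G/-s/-c values are those quoted in the description;
    the input and output paths are hypothetical placeholders.
    """
    return [
        "cd-hit",
        "-i", merged_faa,      # merged, non-redundant protein FASTA (assumed name)
        "-o", clustered_faa,   # one representative per cluster (assumed name)
        "-T", "0", "-M", "0", "-G", "1", "-s", "0.8", "-c", "0.9",
    ]


def prokka_command(contigs_fasta: str, locus_tag: str) -> List[str]:
    """Build the Prokka call with the options quoted in the description.

    The locus tag value and the contig file are hypothetical; the
    description lists --locustag without giving its value.
    """
    return [
        "prokka",
        "--force", "--addgenes",
        "--locustag", locus_tag,
        "--compliant", "--usegenus",
        "--genus", "Rhizobium",
        "--kingdom", "Bacteria",
        "--gcode", "11",
        contigs_fasta,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch works
    # without CD-HIT or Prokka installed.
    print(shlex.join(cdhit_command("merged_proteomes.faa", "reference_db.faa")))
    print(shlex.join(prokka_command("new_genome.fna", "NEWGENOME")))
```

To actually run either command, the returned list could be passed to `subprocess.run(cmd, check=True)`, assuming the corresponding tool is on the `PATH`.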
Provider:
figshare
Created:
2020-10-20



