
List of genomes used as a reference for the functional annotation of proteins in new genomes

Source: DataCite Commons. Updated 2020-10-20; indexed 2024-07-28.
Download link:
https://figshare.com/articles/dataset/List_of_genomes_used_a_reference_for_the_functional_annotation_of_proteins_in_new_genomes/13118405
Description:
We used Prokka (version 1.12) (Seemann 2014) to annotate the new genome sequences, with the following notable options: “--force --addgenes --locustag --compliant --usegenus --genus Rhizobium --kingdom Bacteria --gcode 11”. The reference database of annotated proteins was built from the proteomes predicted from all available complete genomes of the Rhizobium/Agrobacterium group (n=206), downloaded from the NCBI RefSeq Assembly database on 15 December 2017 using the query ‘txid227290[Organism:exp] AND ("latest refseq"[filter] AND ("scaffold level"[filter] OR "chromosome level"[filter] OR "complete genome"[filter]) AND all[filter] NOT anomalous[filter])’. These proteomes were merged into a non-redundant dataset, which was thinned by retaining a single representative protein for each protein cluster, as determined by the CD-HIT clustering program (version 4.6) (Fu et al. 2012) run with the options ‘-T 0 -M 0 -G 1 -s 0.8 -c 0.9’. The resulting dataset was used as the subject database for the BLASTP (Camacho et al. 2009) similarity search during annotation.
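As a rough illustration of the two command-line steps described above, the sketch below assembles the CD-HIT and Prokka invocations from the options quoted in the description. It only builds and prints the commands; it does not run them. The file names, the output directory, and the locus tag value are hypothetical placeholders (the description gives no value for --locustag), and the `-i`, `-o`, and `--outdir` arguments are standard options of the two tools that the description does not list. How the clustered proteins were made available to Prokka as a BLASTP database is not stated in the description and is not shown here.

```python
import shlex

# Options quoted in the dataset description.
# CD-HIT: -c 0.9 is the sequence identity threshold, -s 0.8 the length
# difference cutoff, -G 1 selects global identity, -T 0 uses all CPU
# threads and -M 0 lifts the memory limit.
CDHIT_OPTIONS = ["-T", "0", "-M", "0", "-G", "1", "-s", "0.8", "-c", "0.9"]
PROKKA_OPTIONS = [
    "--force", "--addgenes", "--compliant", "--usegenus",
    "--genus", "Rhizobium", "--kingdom", "Bacteria", "--gcode", "11",
]


def cdhit_command(merged_faa, clustered_faa):
    """Build the CD-HIT call that keeps one representative per protein cluster."""
    return ["cd-hit", "-i", merged_faa, "-o", clustered_faa, *CDHIT_OPTIONS]


def prokka_command(contigs_fasta, outdir, locus_tag):
    """Build the Prokka call for one new genome.

    locus_tag is a hypothetical value: the description names the
    --locustag option but does not say what was passed to it.
    """
    return [
        "prokka", *PROKKA_OPTIONS,
        "--locustag", locus_tag,
        "--outdir", outdir,
        contigs_fasta,
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(shlex.join(cdhit_command("merged_proteomes.faa", "reference_clustered.faa")))
    print(shlex.join(prokka_command("new_genome.fasta", "annotation/new_genome", "NEWGEN")))
```

Building the commands as argument lists keeps each option separate, so the same lists can later be passed to `subprocess.run` without shell quoting problems.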
Provider:
figshare
Created:
2020-10-20