Fast.genomics comparative genome browser: 2023 release
Source: DataCite Commons. Updated: 2023-08-23. Indexed: 2024-08-26.
Download link:
https://figshare.com/articles/dataset/Fast_genomics_comparative_genome_browser_2023_release/24010353
Description:
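This is an archive of the 2023 release of http://fast.genomics.lbl.gov, a fast comparative genome browser for diverse bacteria and archaea.
fast_code_Aug2023.tar.gz contains the source code, which is also available on GitHub. See SETUP for installation instructions and lib/neighbor.sql for the database schema.
Some of the code depends on Perl libraries from the PaperBLAST code base. These are archived in PaperBLAST_lib.tar.gz and should be extracted into ../PaperBLAST/ to create the ../PaperBLAST/lib directory. They are also available on GitHub.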
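fast_main_May2023.tar.gz contains the main database, with one representative genome for each of 6,377 genera. neighbor.db is the SQLite3 database and neighbor.faa.gz has the protein sequences. Put these files in the data/ directory.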
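Before building any indexes, it can help to confirm that the files ended up where the instructions above expect them. The following is a minimal sketch, not part of the fast.genomics code: `check_layout` is a hypothetical helper name, and it assumes it is pointed at the top of the unpacked source tree, so that `data/` and `../PaperBLAST/lib` resolve as described.

```shell
# Hypothetical helper (not part of the fast.genomics distribution).
# Prints each expected path that is missing relative to ROOT and
# returns non-zero if anything is missing.
check_layout() {
    root="${1:-.}"
    status=0
    # Perl libraries extracted from PaperBLAST_lib.tar.gz
    if [ ! -d "$root/../PaperBLAST/lib" ]; then
        echo "missing: $root/../PaperBLAST/lib"
        status=1
    fi
    # SQLite3 database from fast_main_May2023.tar.gz
    if [ ! -f "$root/data/neighbor.db" ]; then
        echo "missing: $root/data/neighbor.db"
        status=1
    fi
    # Protein sequences: accept the compressed or the decompressed form
    if [ ! -f "$root/data/neighbor.faa.gz" ] && [ ! -f "$root/data/neighbor.faa" ]; then
        echo "missing: $root/data/neighbor.faa(.gz)"
        status=1
    fi
    return "$status"
}
```

Run from the source directory, `check_layout . && echo "layout looks complete"` prints the confirmation only when all three items are present.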
You can build the mmseqs database with:
gunzip data/neighbor.faa.gz
mmseqs createdb data/neighbor.faa data/mmseqsdb --dbtype 1
mmseqs createindex data/mmseqsdb /tmp -k 6
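The three commands above decompress the protein FASTA file, convert it into an MMseqs2 sequence database (`--dbtype 1` marks the input as amino acid sequences), and precompute an index with k-mer length 6 (`-k 6`), using /tmp for temporary files. A hedged sketch of the same steps as re-runnable shell functions follows. `decompress_once` and `build_mmseqs_db` are hypothetical names, not part of the release; unlike plain `gunzip`, the sketch keeps the .gz file and uses a private temporary directory instead of /tmp.

```shell
# Hypothetical helpers; names are illustrative, not from the fast.genomics code.

# Decompress FILE.gz to FILE unless a non-empty FILE already exists.
# Keeps the .gz (plain gunzip would delete it). Prints the output path.
decompress_once() {
    gz="$1"
    out="${gz%.gz}"
    if [ ! -s "$out" ]; then
        # Write to a temporary name first so an interrupted run
        # never leaves a truncated FASTA file behind.
        gunzip -c "$gz" > "$out.tmp" && mv "$out.tmp" "$out" || {
            rm -f "$out.tmp"
            return 1
        }
    fi
    printf '%s\n' "$out"
}

# Build the MMseqs2 database and index for the main database in DATA_DIR.
# Requires mmseqs on PATH.
build_mmseqs_db() {
    data_dir="${1:-data}"
    faa="$(decompress_once "$data_dir/neighbor.faa.gz")" || return 1
    tmp_dir="$(mktemp -d)" || return 1
    rc=0
    # --dbtype 1: treat the input as amino acid sequences
    mmseqs createdb "$faa" "$data_dir/mmseqsdb" --dbtype 1 || rc=$?
    if [ "$rc" -eq 0 ]; then
        # -k 6: k-mer length used for the precomputed index
        mmseqs createindex "$data_dir/mmseqsdb" "$tmp_dir" -k 6 || rc=$?
    fi
    rm -rf "$tmp_dir"
    return "$rc"
}
```

Calling `build_mmseqs_db data` from the source directory performs the same steps as the commands above. Keeping the compressed file costs disk space but makes it possible to rebuild from a known-good input.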
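The sub-databases, with additional genomes for each taxonomic order, can be downloaded here or here. The tarball contains a directory for each sub-database; these subdirectories should go in the data/ directory.
After installing the main database, you can build the clustered BLAST+ database for each sub-database with: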
for sub in $(sqlite3 data/neighbor.db 'select prefix FROM SubDb;'); do
    gunzip "data/$sub/cluster.faa.gz"
    makeblastdb -in "data/$sub/cluster.faa" -dbtype prot -out "data/$sub/cluster.faa.plusdb"
done
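Or, to download the sub-database for a specific order, visit the fast.genomics web site, search for that order, switch to the sub-database, and see the downloads section at the bottom of the main page.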
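The sub-database loop above iterates over every prefix in the SubDb table, so it reports errors for any sub-database whose directory has not been placed in data/, for instance when only a few orders were downloaded. A minimal sketch of a pre-flight check and a more tolerant build loop follows. `missing_subdbs` and `build_subdb_blast` are hypothetical names, not part of the release; both read prefixes on standard input so they can be fed from the same sqlite3 query.

```shell
# Hypothetical helpers; not part of the fast.genomics distribution.

# Read sub-database prefixes on stdin (one per line) and print those with
# neither cluster.faa.gz nor cluster.faa under DATA_DIR/<prefix>/.
missing_subdbs() {
    data_dir="${1:-data}"
    while IFS= read -r sub; do
        [ -n "$sub" ] || continue
        if [ ! -f "$data_dir/$sub/cluster.faa.gz" ] &&
           [ ! -f "$data_dir/$sub/cluster.faa" ]; then
            printf '%s\n' "$sub"
        fi
    done
}

# Read prefixes on stdin and build a protein BLAST+ database for each
# sub-database that is present, skipping the others with a note on stderr.
# Requires makeblastdb on PATH.
build_subdb_blast() {
    data_dir="${1:-data}"
    while IFS= read -r sub; do
        [ -n "$sub" ] || continue
        dir="$data_dir/$sub"
        # Decompress only if the plain FASTA file is not already there
        if [ -f "$dir/cluster.faa.gz" ] && [ ! -f "$dir/cluster.faa" ]; then
            gunzip "$dir/cluster.faa.gz" || return 1
        fi
        if [ ! -f "$dir/cluster.faa" ]; then
            echo "skipping $sub: no cluster.faa" >&2
            continue
        fi
        # stdin is redirected so the tool cannot consume the prefix list
        makeblastdb -in "$dir/cluster.faa" -dbtype prot \
            -out "$dir/cluster.faa.plusdb" < /dev/null || return 1
    done
}
```

For example, `sqlite3 data/neighbor.db 'select prefix FROM SubDb;' | missing_subdbs data` lists the prefixes that still need to be downloaded, and piping the same query into `build_subdb_blast data` builds the databases for those that are present.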
Provider:
figshare
Created:
2023-08-23



