
CO-ARBitrator Rev2 README

Source: Figshare. Updated 2019-07-08; indexed 2026-04-08.
Download link:
https://figshare.com/articles/CO-ARBitrator_Rev2_README/8283539/1
Description:
17 June 2019: Revision 2 of the CO-ARBitrator data set is available. It is based on a 21 March 2019 snapshot of GenBank and contains 1,286,434 records, of which 249,002 are new since the original release.

The release consists of 4 files: a nucleotide FASTA, an incremental nucleotide FASTA, an amino acid FASTA, and a TSV table. In the TSV table, each row describes a sequence and provides nucleotide and amino acid accession numbers and sequences, as well as taxonomy.

The defline format of the FASTA files has changed since the original release, hopefully for the better! In rev 1, deflines contained taxonomy copied verbatim from sequence record pages. These taxonomies are lists of values without rank identifiers; intermediate ranks (e.g. suborder or infraorder) may or may not be present.

In rev 2, wherever possible, the taxonomy in the FASTA deflines has been retrieved from the NCBI taxonomy browser. Where taxonomy is not available, it is deduced from the "organism" field of the sequence's NCBI record.

Example defline:
>AAG37251|Litoria_nannotis|K_Metazoa__P_Chordata__C_Amphibia__O_Anura__F_Hylidae__G_Litoria__S_nannotis

The format is:
>Accession|Binomial|Taxonomy

Taxonomic ranks are labeled by the first letter of the rank. In the example above, the kingdom (K) is Metazoa, the phylum (P) is Chordata, etc.

Please address questions, comments, or requests to Phil Heller at philip.heller@sjsu.edu.
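For readers who want to work with the rev 2 deflines programmatically, the following is a minimal Python sketch of how one might split a defline into its three parts. It is not part of the release. The function name `parse_defline` is hypothetical, and the parsing rules (rank fields separated by a double underscore, rank letter separated from its value by the first single underscore) are assumptions inferred from the single example defline above, so they may not hold for every record.

```python
def parse_defline(defline: str) -> dict:
    """Split a rev 2 defline of the form >Accession|Binomial|Taxonomy.

    Assumption (inferred from the one example in the README): rank fields
    are joined by "__", and each field is "<rank letter>_<value>".
    """
    # Drop the leading ">" and any trailing newline, then split into 3 parts.
    accession, binomial, taxonomy = defline.strip().lstrip(">").split("|", 2)

    ranks = {}
    for field in taxonomy.split("__"):
        # Split on the first underscore only, so values that themselves
        # contain underscores are kept whole.
        label, _, value = field.partition("_")
        ranks[label] = value

    return {"accession": accession, "binomial": binomial, "ranks": ranks}


example = (
    ">AAG37251|Litoria_nannotis|"
    "K_Metazoa__P_Chordata__C_Amphibia__O_Anura__F_Hylidae__G_Litoria__S_nannotis"
)
record = parse_defline(example)
print(record["accession"], record["ranks"]["K"], record["ranks"]["P"])
```

A dictionary keyed by rank letter makes it easy to filter records by, say, phylum or family without depending on the position of a rank in the list, which is the problem the rev 1 format had when intermediate ranks were present or absent.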
Created:
2019-06-18