PHYTOPK28-D1D2: A curated database of 28S rRNA gene D1-D2 domains from eukaryotic organisms dedicated to metabarcoding analyses of marine phytoplankton samples

Indexed from NIAID Data Ecosystem on 2026-03-10

Download link:

https://data.mendeley.com/datasets/mndb4h87yg

Description:

The PHYTOPK28-D1D2 database comprises accession numbers, taxonomic classifications and 28S rDNA (D1-D2 domain) sequences that are available in public DNA databases. The sequences are listed in FASTA format, and each is identified by its accession number and its hierarchical taxonomy. The database was built for the taxonomic annotation of DNA metabarcodes generated from water samples that were collected in six French Mediterranean lagoons once a month between May and September/October 2012 and fractionated into three size ranges (0.7-5 µm, 5-20 µm and 20-100 µm). This metabarcode dataset was deposited in the European Nucleotide Archive under accession number PRJEB18757. The database was started from an initial dataset retrieved on April 19, 2013 from the ribosomal DNA database SILVA. Further sequences were added through extensive BLAST searches of the NCBI/GenBank nucleotide database, targeting the main taxonomic divisions among eukaryotic algal and plankton lineages, whether marine or freshwater, and excluding environmental sequences. The first version of the database, PHYTOPK28-D1D2_v1, was assembled by the end of June 2015. When it was used to annotate the metabarcode library, it contained 8,753 reference sequences, including more than 3,600 from algal/phytoplanktonic lineages (Chlorophyta, Cryptophyta, Dinophyceae, Haptophyceae, Stramenopiles, Rhodophyta, Euglenozoa, Rhizaria, Glaucocystophyceae) and ~700 from microzooplankton (including ciliates, rotifers and copepods). The database is not claimed to be exhaustive with respect to its purpose. Nor is it warranted to be free of overlooked identification errors, whether these stem from undetected errors in the original public database depositions or from missed literature reporting taxonomic changes. It may also lack data that had only recently been released when it was used in June 2015.
The authors intend to enrich the database further by adding new (mostly recently released) sequence accessions and to make new versions available from time to time. Anyone interested in receiving a recently updated version can contact the first author (DG). Reports of errors, omissions or recently released sequences are welcome and would help this updating effort. The database could also be enriched with more information on the reference sequences, for example by linking accession numbers to the corresponding GenBank records, by adding and linking the article associated with each sequence submission (information that is not always kept up to date in public DNA databases) and, eventually, by adding the subsequent literature that led to changes in the taxonomic name or classification of the organisms.
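
Because each FASTA record is described as carrying an accession number together with its hierarchical taxonomy, a short script can summarise how many reference sequences fall under each lineage. The sketch below is only an illustration. It assumes a header of the form `>ACCESSION rank1;rank2;...`, which the description does not specify, so the delimiters may need adjusting to the actual file. The accessions and sequences in `EXAMPLE_FASTA` are invented placeholders, and the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple

# Invented placeholder records; NOT taken from the real database.
EXAMPLE_FASTA = """\
>XX000001 Eukaryota;Chlorophyta;Example_genus_a
ACGTACGT
ACGT
>XX000002 Eukaryota;Chlorophyta;Example_genus_b
TTGGCCAA
>XX000003 Eukaryota;Dinophyceae;Example_genus_c
GGCCTTAA
"""


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        yield header, "".join(chunks)


def split_header(header: str) -> Tuple[str, List[str]]:
    """Split a header into (accession, list of taxonomic ranks).

    ASSUMPTION: the accession comes first, followed by whitespace and a
    semicolon-separated taxonomy string.
    """
    parts = header.split(None, 1)
    accession = parts[0] if parts else ""
    taxonomy = parts[1] if len(parts) > 1 else ""
    ranks = [r.strip() for r in taxonomy.split(";") if r.strip()]
    return accession, ranks


def load_records(lines: Iterable[str]) -> List[Tuple[str, List[str], str]]:
    """Return a list of (accession, ranks, sequence) tuples."""
    records = []
    for header, sequence in read_fasta(lines):
        accession, ranks = split_header(header)
        records.append((accession, ranks, sequence))
    return records


def count_by_rank(records, level: int) -> Counter:
    """Count records by the taxon name found at the given rank index."""
    counts: Counter = Counter()
    for _accession, ranks, _sequence in records:
        name = ranks[level] if level < len(ranks) else "unclassified"
        counts[name] += 1
    return counts


if __name__ == "__main__":
    example = load_records(EXAMPLE_FASTA.splitlines())
    for taxon, n in count_by_rank(example, level=1).most_common():
        print(taxon, n)
```

To run this against the downloaded database, pass an open file handle to `load_records` in place of `EXAMPLE_FASTA.splitlines()`. Which rank index corresponds to which taxonomic level depends on how the taxonomy strings are actually structured in the file.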

Created:

2017-07-18