
COins database

Figshare | Updated 2024-08-29 | Indexed 2026-04-08
Download link:
https://figshare.com/articles/dataset/COins_database/19130465/4
Description:
COins is a database of insect COI-5P sequences that includes over 532,000 representative sequences from more than 106,000 species, formatted specifically for the QIIME2 software platform. It was developed through a combination of automated and manually curated steps, starting from the insect COI sequences available in the Barcode of Life Data System (BOLD) and selecting those that comply with several standards, including a species-level identification.

Files:

seq-degapped.qza --> reference sequences
taxonomy.qza --> sequence taxonomy
SklearnClassifier_COins_QIIME2_v2024.5.qza (NEW!) --> naïve Bayes taxonomic classifier trained on COins (QIIME2 version 2024.5)
SklearnClassifier_COins_QIIME2_v2023.5.qza --> naïve Bayes taxonomic classifier trained on COins (QIIME2 version 2023.5)
SklearnClassifier_COins_QIIME2_v2022.2.qza --> naïve Bayes taxonomic classifier trained on COins (QIIME2 version 2022.2)
Sequences_metadata1.tsv --> Identification procedure of the voucher specimens from which the reference sequences were obtained. The identification procedure is reported for each sequence included in COins (BOLD id given in the "BOLDid reference" column) and for all identical sequences within haplotypes that were removed at Step 5 of COins curation (those with no BOLD id in the "BOLDid reference" column). The haplotype to which each sequence belongs is reported in the "Haplotype" column (the haplotypes of each species are labeled with increasing numbers). The identification procedure information is derived from the sequence-associated metadata provided by the BOLD system.
Sequences_metadata2.tsv --> Identical sequences belonging to different species present within COins. Each row represents a cluster of identical sequences associated with different species; the sequences in each cluster are labeled with species name and BOLD id.
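As a rough illustration of how the haplotype information in Sequences_metadata1.tsv might be read, the sketch below counts, for each species and haplotype, how many rows carry a BOLD id in the "BOLDid reference" column (sequences kept in COins) and how many do not (identical sequences removed during curation). The "BOLDid reference" and "Haplotype" column names come from the description above; the "Species" column name, the exact file layout, and the sample rows are assumptions made only for this example and may not match the real file.

```python
import csv
import io
from collections import defaultdict

# Column names from the dataset description.
REF_COL = "BOLDid reference"
HAP_COL = "Haplotype"
# Assumption: the real file may name its species column differently.
SPECIES_COL = "Species"


def summarize_haplotypes(tsv_text):
    """Count retained vs. removed sequences per (species, haplotype).

    A row with a non-empty REF_COL is treated as a sequence kept in COins;
    a row with an empty REF_COL is treated as an identical sequence that
    was removed during curation.
    """
    summary = defaultdict(lambda: {"retained": 0, "removed": 0})
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        key = (row[SPECIES_COL], row[HAP_COL])
        has_ref = bool((row.get(REF_COL) or "").strip())
        summary[key]["retained" if has_ref else "removed"] += 1
    return dict(summary)


# Made-up sample rows for illustration only (not real COins records).
SAMPLE = (
    "Species\tHaplotype\tBOLDid reference\n"
    "Species a\t1\tID0001\n"
    "Species a\t1\t\n"
    "Species a\t2\tID0002\n"
)

if __name__ == "__main__":
    for (species, hap), counts in summarize_haplotypes(SAMPLE).items():
        print(species, hap, counts)
```

To apply this to the downloaded file, the text of Sequences_metadata1.tsv would be read and passed to `summarize_haplotypes`, after adjusting the column constants to the headers actually present.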
Provided by:
Magoga, Giulia
Created:
2024-08-29