Reference Sequence Library Resources - Maine-eDNA
Download link:
https://zenodo.org/record/7942246
Description:
The following files and resources are associated with the Maine-eDNA Reference Library Research Group, which aims to create reference sequence library resources for researchers who are part of Maine-eDNA or are otherwise interested in leveraging eDNA tools for research in the Gulf of Maine.
These include:
RoughWorkflow.zip
Contains an NCBI scraping script that builds reference databases from an input species list, a configuration file for the script, and a genbankr version. Most useful for shorter species lists, since the workflow is time-intensive
12S_REFDB.fasta
A DADA2-compliant reference library for 12S sequences, built with the RoughWorkflow based on the GNRMaineSpecies_May2024 species list
COI_REFDB.fasta
A DADA2-compliant reference library for COI sequences, built with the RoughWorkflow based on the GNRMaineSpecies_May2024 species list
GitHub Repo - referee - https://github.com/BigelowLab/referee
Scripts for building reference databases for the Maine-eDNA project by downloading GenBank data. This workflow is suggested especially for large species lists
GitHub Repo - refdbtools - https://github.com/BigelowLab/refdbtools
An R package that assists in building eDNA reference databases
SpeciesListMeta_Shareable.xlsx
Describes the sources from which the GNRMaineSpecies_May2024.csv and GNRMaineTaxonomiedSpecies_May2024.csv species lists were compiled - sources not associated with a link were found as separate files and are hosted elsewhere. Species list sources are courtesy of public datasets, Maine-eDNA researchers, and collaborators
GNRMaineSpecies_May2024.csv
A Maine (and surrounding area) species list run through taxize’s gnr_resolve to resolve species names and fill out taxonomy (full results)
GNRMaineTaxonomiedSpecies_May2024.csv
A Maine (and surrounding area) species list run through taxize’s gnr_resolve to resolve species names and fill out taxonomy (only species results that could be resolved with taxonomy)
Contact Beth Y. Davis - bethy.davis4@gmail.com for questions
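For readers who want to inspect 12S_REFDB.fasta or COI_REFDB.fasta before using them, the following is a minimal Python sketch that tallies records by taxonomic rank. It assumes the headers follow the semicolon-delimited layout that DADA2's assignTaxonomy expects (for example `>Level1;Level2;Level3;`); the exact ranks and their order in the Maine-eDNA files are not stated here and should be checked against the files themselves. The function names and the example records are hypothetical and for illustration only.

```python
from collections import Counter
from io import StringIO
from typing import Iterator, TextIO


def read_fasta(handle: TextIO) -> Iterator[tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_taxonomy(header: str) -> list[str]:
    """Split a semicolon-delimited header into ranks (assumed layout)."""
    return [rank.strip() for rank in header.split(";") if rank.strip()]


def count_by_rank(handle: TextIO, rank_index: int) -> Counter:
    """Count records by the rank at position rank_index (0-based)."""
    counts = Counter()
    for header, _sequence in read_fasta(handle):
        ranks = split_taxonomy(header)
        key = ranks[rank_index] if rank_index < len(ranks) else "unassigned"
        counts[key] += 1
    return counts


# Made-up records for illustration; not taken from the actual libraries.
example = StringIO(
    ">Eukaryota;Chordata;Actinopteri;Gadiformes;Gadidae;Gadus;\n"
    "ACGTACGT\nACGT\n"
    ">Eukaryota;Chordata;Actinopteri;Clupeiformes;Clupeidae;Clupea;\n"
    "TTGACCA\n"
    ">Eukaryota;Arthropoda;\n"
    "GGCATT\n"
)
phylum_counts = count_by_rank(example, rank_index=1)
print(phylum_counts)
```

To run this against a downloaded library, replace the `StringIO` example with `open("12S_REFDB.fasta")` and choose the `rank_index` that matches the header layout actually used in the file.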
###
Changelog:
May 16, 2023 (Version 1.0) - Initial upload
July 11, 2023 (Version 1.1) - Additional cleaning of the MaineSpeciesList_Clean file; uploaded the new version as July2023_SpeciesList
May 20, 2024 (Version v3) - Additional cleaning to correct deduplication errors; ran taxize's gnr_resolve to resolve names and fill out taxonomy. This version adds two files: GNRMaineSpecies_May2024.csv, containing the full result of gnr_resolve, and GNRMaineTaxonomiedSpecies_May2024.csv, containing only those species from the original list that could be resolved with taxonomy. The SpeciesListMeta_Shareable.csv has not been updated but is still an accurate tracker of the sources from which the species names were obtained.
November 27, 2024 (Version 4.0) - Updated Zenodo description and added the RoughWorkflow R files, 12S and COI files
Created:
2024-11-28



