Updating splits, lumps, and shuffles: Reconciling GenBank names with standardized avian taxonomies
Source: DataCite Commons. Updated 2025-06-01; indexed 2025-05-10.
Download link:
https://datadryad.org/dataset/doi:10.5061/dryad.gtht76hqf
Resource description:
Abstract: Biodiversity research has advanced by testing expectations of
ecological and evolutionary hypotheses through the linking of large-scale
genetic, distributional, and trait datasets. The rise of molecular
systematics over the past 30 years has resulted in a wealth of DNA
sequences from around the globe. Yet, advances in molecular systematics
also have created taxonomic instability, as new estimates of evolutionary
relationships and interpretations of species limits have required
widespread scientific name changes. Taxonomic instability, colloquially
“splits, lumps, and shuffles,” presents logistical challenges to
large-scale biodiversity research because (1) the same species or sets of
populations may be listed under different names in different data sources,
or (2) the same name may apply to different sets of populations
representing different taxonomic concepts. Consequently, distributional
and trait data are often difficult to link directly to primary DNA
sequence data without extensive and time-consuming curation. Here, we
present RANT: Reconciliation of Avian NCBI Taxonomy. RANT applies
taxonomic reconciliation to standardize avian taxon names in use in NCBI
GenBank, a primary source of genetic data, to a widely used and regularly
updated avian taxonomy: eBird/Clements. Of 14,341 avian species/subspecies
names in GenBank, 11,031 directly matched an eBird/Clements name; these link to
more than 6 million nucleotide sequences. For the remaining unmatched
avian names in GenBank, we used Avibase’s system of taxonomic concepts,
taxonomic descriptions in Cornell’s Birds of the World, and DNA sequence
metadata to identify corresponding eBird/Clements names. Reconciled names
linked to more than 600,000 nucleotide sequences, ~9% of all avian
sequences on GenBank. Nearly 10% of eBird/Clements names had nucleotide
sequences listed under 2 or more GenBank names. Our taxonomic
reconciliation is a first step towards rigorous and open-source curation
of avian GenBank sequences and is available at GitHub, where it can be
updated to correspond to future annual eBird/Clements taxonomic updates.
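
The two-stage matching described in the abstract (direct name matches first, then curated reconciliation for the remainder) can be sketched roughly as follows. This is a minimal illustration and not the RANT implementation: the function names `reconcile` and `names_per_target`, the lookup table `manual_map`, and the placeholder taxon names are all hypothetical, and the real curation relied on Avibase taxonomic concepts, Birds of the World descriptions, and sequence metadata rather than a simple dictionary.

```python
def reconcile(genbank_names, clements_names, manual_map):
    """Map each GenBank name to an eBird/Clements name.

    Returns {genbank_name: (clements_name_or_None, status)}, where status
    is "direct", "reconciled", or "unmatched". `manual_map` is a
    hypothetical stand-in for the curated reconciliation table.
    """
    clements = set(clements_names)
    result = {}
    for name in genbank_names:
        if name in clements:
            # Stage 1: the GenBank name is already a valid target name.
            result[name] = (name, "direct")
        elif name in manual_map:
            # Stage 2: fall back to the curated name-to-name mapping.
            result[name] = (manual_map[name], "reconciled")
        else:
            result[name] = (None, "unmatched")
    return result


def names_per_target(result):
    """Group GenBank names by the eBird/Clements name they resolve to."""
    grouped = {}
    for genbank_name, (target, _status) in result.items():
        if target is not None:
            grouped.setdefault(target, []).append(genbank_name)
    return grouped


# Placeholder names, invented purely for illustration.
genbank = ["Aus bus", "Aus cus", "Aus dus"]
clements = ["Aus bus", "Aus eus"]
curated = {"Aus cus": "Aus bus"}  # e.g. a lumped taxon

result = reconcile(genbank, clements, curated)
for target, sources in names_per_target(result).items():
    print(target, "<-", sources)
```

Grouping by target name is what exposes cases like the one the abstract reports, where a single eBird/Clements name has sequences listed under two or more GenBank names.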
Provider:
Dryad
Created:
2022-08-25



