Updating splits, lumps, and shuffles: Reconciling GenBank names with standardized avian taxonomies
Mendeley Data | Updated 2024-04-13 | Indexed 2024-06-29
Download link:
https://datadryad.org/stash/dataset/doi:10.5061/dryad.gtht76hqf
Description:
D1:
"PetersVsClements2Final.txt" - This file shows which species names from the Peters taxonomy match the 2019 eBird/Clements taxonomy. The first column has a species name from the Peters taxonomy. In the second column, "Clements" indicates that the species name matches the eBird/Clements taxonomy, "No" means it does not match, and "Close" means that the names match when the last two letters are disregarded.
"SibleyMonroeVsClements_Final.txt" - This file shows which species names from the Sibley Monroe taxonomy match the 2019 eBird/Clements taxonomy. The first column has a species ID number from the Sibley Monroe taxonomy. The second column has the scientific name from the Sibley Monroe taxonomy. The third column has the common name from the Sibley Monroe taxonomy. In the fourth column, "Clements" indicates that the species name matches the eBird/Clements taxonomy, "No" means it does not match, and "Close" means that the names match when the last two letters are disregarded.

D2:
"taxonomy_result.unix.xml" - XML file of the NCBI taxonomy containing the names descending from "Aves" (downloaded May 3, 2020).
"GenBank.AvesSpecies.txt" - Text file with the GenBank species and subspecies names within "Aves". The first column has the GenBank taxon ID number, the second column has the scientific name corresponding to that taxon ID, and the third column states whether the name corresponds to a species or a subspecies.
"extractGBnames.pl" - Perl script that reads in "taxonomy_result.unix.xml" and outputs the taxon ID numbers, their corresponding scientific names, and their rank (e.g. "species").

D3:
"compare.pl" - Perl script that reads in the 2019 eBird/Clements taxonomy (the file "EbirdClements.txt" in D4) and the list of GenBank taxon names from Aves (the file "GenBank.AvesSpecies.txt" in D2). If the GenBank taxon name exactly matches a name in the eBird/Clements taxonomy, it outputs the GenBank taxon ID, GenBank name, GenBank rank, and all the information associated with the name in the eBird/Clements taxonomy. If the GenBank name does not match, it outputs only the GenBank taxon ID, GenBank name, and GenBank rank.

D4:
"EbirdClements.txt" - Text file with the taxonomic names and associated metadata from the 2019 eBird/Clements dataset. The first column has the code associated with the species name, the second column has the rank, the third column has the common name, the fourth column has the scientific name, the fifth column has the range, the sixth column has the order name, and the last column has the family name (with the common family name next to it in parentheses).

D5:
"nucl_gb.accession2taxid.gz" - Compressed file with the GenBank accession numbers from the core nucleotide database and the taxon ID associated with each sequence. This was downloaded from NCBI on November 2, 2020.

D6:
"taxonomy_result.txt" - Text file with a list of GenBank taxon ID numbers associated with species and subspecies within Aves.
"countgb.pl" - Perl script that reads in "taxonomy_result.txt" and "nucl_gb.accession2taxid" (from D5) and outputs the number of GenBank sequences associated with each avian species or subspecies ID number.

D7:
"SraResultInfo.csv" - CSV file that summarizes the data from each run in the NCBI SRA database associated with an Aves taxon. The 28th column has the GenBank taxon ID associated with the SRA run. This information was downloaded from NCBI on August 1, 2021.

D8:
"genome_result.txt" - Text file with a summary of the genome files in NCBI associated with taxa within Aves. This file was downloaded on September 5, 2021. The taxon name is next to the number of each entry.
"getnames.pl" - Perl script that reads in "genome_result.txt" and outputs a list of avian taxa with genome files and the number of genome files associated with each taxon.

D9:
"MacaulayLibrary_MediaSummary_April_2021.csv" - CSV file summarizing Macaulay Library audio recordings and GenBank nucleotide sequences associated with eBird/Clements 2019 names (downloaded April 2021).

D10:
"Xeno-canto_MediaSummary_October2020.csv" - CSV file summarizing Xeno-canto audio recordings and GenBank nucleotide sequences associated with eBird/Clements 2019 names (downloaded October 2020).

D11:
"GenBank_eBird/Clements2019_taxonomic_reconciliation_12Nov2021.csv" - CSV file reconciling GenBank TaxIDs with the eBird/Clements 2019 taxonomy.

D12:
"TaxonomicReconciliation_IUCNstatus.csv" - CSV file reconciling GenBank TaxIDs with the eBird/Clements 2019 taxonomy, with respect to IUCN status.

D13:
"Taxonomic_reconciliation_related_to_geographic_realm.csv" - CSV file with reconciliation status related to geographic realms.

D14:
"TaxonomicReconciliation_Xeno-canto.csv" - CSV file reconciling the eBird/Clements 2019 taxonomy with Xeno-canto, which uses the IOC taxonomy.
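
The name-matching behaviour described for "compare.pl" and the "Clements" / "Close" / "No" labels in D1 can be illustrated with a minimal Python sketch. This is not the authors' Perl code. It assumes tab-delimited input, the column order given in the D2 and D4 descriptions, and that "Close" means two names are identical once the last two characters of each are dropped. The function names and the sample rows are hypothetical and are not taken from the dataset.

```python
import csv
import io


def load_clements(handle):
    """Index Clements rows by scientific name (fourth column, per the D4 description)."""
    return {row[3]: row for row in csv.reader(handle, delimiter="\t") if len(row) > 3}


def match_status(name, clements):
    """Return 'Clements' for an exact match, 'Close' if the names agree
    once the last two letters are dropped (assumed rule), otherwise 'No'."""
    if name in clements:
        return "Clements"
    stems = {known[:-2] for known in clements}
    return "Close" if name[:-2] in stems else "No"


def compare(genbank_handle, clements):
    """Yield GenBank ID, name, and rank, followed by the full Clements row
    when the GenBank name matches a Clements name exactly."""
    for row in csv.reader(genbank_handle, delimiter="\t"):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        taxid, name, rank = row[:3]
        yield [taxid, name, rank] + clements.get(name, [])


# Made-up sample rows for illustration only; not real taxa, codes, or taxon IDs.
CLEMENTS_SAMPLE = (
    "code1\tspecies\tExample Bird\tExemplum album\tNowhere\t"
    "Exampliformes\tExemplidae (Examples)\n"
)
GENBANK_SAMPLE = (
    "1\tExemplum album\tspecies\n"
    "2\tExemplum albus\tspecies\n"
    "3\tAliud nigrum\tspecies\n"
)

clements = load_clements(io.StringIO(CLEMENTS_SAMPLE))
rows = list(compare(io.StringIO(GENBANK_SAMPLE), clements))
for row in rows:
    print("\t".join(row), match_status(row[1], clements), sep="\t")
```

To run this against the real files, the two `io.StringIO` objects would be replaced with open handles to "GenBank.AvesSpecies.txt" and "EbirdClements.txt", after checking that the delimiter assumption holds for those files.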
Created:
2023-06-28



