Aggregated Beaufort Sea benthic infauna data from the National Oceanographic Data Center (NODC), 1971-1980
Source: DataONE. Updated: 2021-03-09. Indexed: 2024-06-08.
Download link:
https://search.dataone.org/view/10.24431_rw1k57r_20210309T194253Z
Description:
These data were originally collected in the 1970s and early 1980s and archived at NODC in a fixed-column text format in which the column layout of each line depends on the record type that line represents. The text files were parsed with Python code that splits the data into separate files by record type and stores the results in comma-separated values (CSV) format. Inputs to the Python code are the original data file, CSV files describing how to parse each record type within the data file, and any lookups required to interpret the data, such as transforming an equipment code of "8" into "EKMAN GRAB". The CSV files of parsing instructions were created from the parsing instructions provided by NCEI. If a record type does not occur in the actual data, no output file is created for that record type. This project includes a readme file, original data files from prior investigators, code lookups, CSV files of parsing instructions, optional files created by splitting the original data files into separate files by record type, output CSV files created by parsing the original data files into separate files by record type, and the Python scripts that perform the parsing. The output CSV files are the dataset produced by this work. Parsing instructions for the original data files, as well as data codes, can be found at https://www.nodc.noaa.gov/access/dataformats.html. The parsing code can optionally include taxon identifiers from the Integrated Taxonomic Information System (ITIS) in the output; full taxonomic information for these identifiers can be retrieved from the ITIS website, https://itis.gov/.
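The parsing approach described above can be illustrated with a minimal sketch. It is not the project's actual script: the column positions, field names, sample lines, and the assumption that the record type sits in the first character of each line are all hypothetical, chosen only for illustration. The one detail taken from the description is the lookup that turns equipment code "8" into "EKMAN GRAB".

```python
import csv
import io
from collections import defaultdict

# Hypothetical parsing instructions, one row per field of each record type.
# Column positions are 1-based and inclusive, and are invented for this sketch.
SPEC_CSV = """record_type,field,start,end
1,station_id,2,6
1,equipment_code,7,7
2,station_id,2,6
2,taxon_code,7,12
2,count,13,16
3,station_id,2,6
"""

# Lookup taken from the dataset description: equipment code "8" is "EKMAN GRAB".
LOOKUPS = {"equipment_code": {"8": "EKMAN GRAB"}}


def load_spec(text):
    """Group the parsing instructions by record type."""
    spec = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        spec[row["record_type"]].append(
            (row["field"], int(row["start"]), int(row["end"]))
        )
    return spec


def parse_lines(lines, spec, lookups):
    """Slice each fixed-width line according to the layout of its record type."""
    parsed = defaultdict(list)
    for line in lines:
        record_type = line[:1]  # assumption: the first character is the record type
        if record_type not in spec:
            continue  # no parsing instructions for this line
        record = {}
        for field, start, end in spec[record_type]:
            value = line[start - 1:end].strip()
            # Replace a code with its description when a lookup exists.
            record[field] = lookups.get(field, {}).get(value, value)
        parsed[record_type].append(record)
    return parsed


def to_csv(records):
    """Serialize the records of one record type as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Invented sample lines: one type-1 record and one type-2 record.
SAMPLE_LINES = [
    "1ST0018",
    "2ST001123456  12",
]

parsed = parse_lines(SAMPLE_LINES, load_spec(SPEC_CSV), LOOKUPS)
for record_type in sorted(parsed):
    print(f"record type {record_type}:")
    print(to_csv(parsed[record_type]))
```

Because only record types that actually occur in the input gain an entry in `parsed`, record type 3 (defined in the hypothetical instructions but absent from the sample lines) produces no output, mirroring the behavior described above.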
Created:
2021-03-09



