
Aggregated Beaufort Sea benthic infauna data from the National Oceanographic Data Center (NODC), 1971-1980

Source: DataONE | Updated: 2021-04-29 | Indexed: 2024-06-08
Download link:
https://search.dataone.org/view/10.24431_rw1k57y_20210429T184836Z
Description:
These data were originally collected in the 1970s and early 1980s and archived at NODC in a text format whose column-based structure varies with the record type represented by each line of text. The text files were parsed with Python code that splits the data into separate files by record type and stores the results in comma-separated values (CSV) format. Inputs to the Python code are the original data file, CSV files describing how to parse each record type within the data file, and any lookups required to interpret the data, such as transforming an equipment code of "8" into "EKMAN GRAB". The CSV files of parsing instructions were created from the parsing instructions provided by NCEI. If a given record type does not occur in the data, no output file is created for that record type. This project includes a readme file, original data files from prior investigators, code lookups, CSV files of parsing instructions, optional files created by splitting the original data files into separate files by record type, output CSV files created by parsing the original data files by record type, and the Python scripts that perform the parsing. The output CSV files are the dataset produced by this work. Parsing instructions for the original data files, as well as data codes, can be found at https://www.nodc.noaa.gov/access/dataformats.html. The parsing code can optionally include taxon identifiers from the Integrated Taxonomic Information System (ITIS) in the output; full taxonomic information for these identifiers can be retrieved from the ITIS website, https://itis.gov/.
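The parsing approach described above can be illustrated with a minimal sketch. It is not the project's actual code: the layout of the parsing-instruction CSV (`record_type`, `field`, `start`, `end`, `lookup`), the field names, the sample lines, and the assumption that the first character of each line identifies its record type are all hypothetical and chosen only for illustration. Only the equipment code "8" mapping to "EKMAN GRAB" comes from the description.

```python
import csv
import io
from collections import defaultdict

# Hypothetical parsing instructions: one row per field, with 1-based,
# inclusive column positions. The real instruction files may differ.
SPEC_CSV = """record_type,field,start,end,lookup
1,station,2,5,
1,equipment,6,6,equipment
2,station,2,5,
2,count,6,9,
3,depth,2,5,
"""

# Lookup tables keyed by name; the "8" -> "EKMAN GRAB" pair is from the
# dataset description, the table name "equipment" is an assumption.
LOOKUPS = {"equipment": {"8": "EKMAN GRAB"}}


def load_spec(text):
    """Group field definitions by record type, converting to slice indices."""
    spec = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        spec[row["record_type"]].append(
            (row["field"], int(row["start"]) - 1, int(row["end"]), row["lookup"])
        )
    return spec


def parse_lines(lines, spec, lookups):
    """Split fixed-width lines into per-record-type lists of dicts."""
    out = defaultdict(list)
    for line in lines:
        record_type = line[:1]  # assumption: first character is the record type
        if record_type not in spec:
            continue  # no parsing instructions for this line
        record = {}
        for field, start, end, lookup in spec[record_type]:
            value = line[start:end].strip()
            if lookup:
                # Replace a code with its description when one is known.
                value = lookups[lookup].get(value, value)
            record[field] = value
        out[record_type].append(record)
    return out


def to_csv(records):
    """Serialize one record type's rows as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=list(records[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# Invented sample lines: one type-1 record and two type-2 records.
SAMPLE = ["1A0018", "2A001  12", "2A001   3"]
parsed = parse_lines(SAMPLE, load_spec(SPEC_CSV), LOOKUPS)
outputs = {rt: to_csv(rows) for rt, rows in parsed.items()}
```

Because `outputs` is built only from record types that actually appear in the input, record type 3 in this sketch yields no output, mirroring the behavior described above for record types absent from the data.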

Created:
2021-04-29