WHONDRS River Corridor Sediment and Water Geochemistry and In Situ Sensor Data from Machine-Learning-Informed Sites across the Contiguous United States (v6)
Source: DataCite Commons. Updated 2026-04-28; indexed 2024-07-13.
Download link:
https://www.osti.gov/servlets/purl/1923689
Resource description:
This dataset supports a broader study examining hyporheic zone respiration rates to improve predictive models at a contiguous United States (CONUS) scale. The CONUS-Scale Model-Sample Study (CM) was designed following ICON (integrated, coordinated, open, and networked) principles to facilitate a model-experiment (ModEx) iteration approach, leveraging crowdsourced sampling across the CONUS. New machine learning models were created every month to guide sampling locations. Data from the resulting samples were used to test and rebuild the machine learning models for the next round of sampling guidance. Sampling began in April 2022 and ended in October 2023. In addition to the widely distributed CONUS sites, more spatially focused sampling took place in the Yakima River Basin, WA, in summer 2022. Data from this more spatially intensive sampling are labeled “Second Spatial Study (SSS)” and were also included in the machine learning models. Other data types collected from SSS that were not part of CM were published in a separate data package (https://data.ess-dive.lbl.gov/view/doi:10.15485/1969566).
This data package was originally published in February 2023. It was updated in June 2023 (v2; new and modified files); December 2023 (v3; new and modified files); June 2024 (v4; new and modified files); April 2024 (v5; new and modified files); and September 2025 (v6; modified files). See the change history section in the readme for more details.
For details on how to navigate data packages generated by this project, see https://data.ess-dive.lbl.gov/portals/PNNLRiverCorridorSFA/About.
This dataset comprises two folders of field photos and videos, one folder of raw Fourier transform ion cyclotron resonance mass spectrometry (FTICR-MS) data, and one main data folder containing (1) file-level metadata; (2) data dictionary; (3) field metadata; (4) readme; (5) international generic sample number (IGSN) mapping file; (6) field protocols; (7) a subfolder with sample data; and (8) a subfolder with sensor data. The sample data subfolder contains (1) surface water and sediment dissolved organic carbon (DOC, measured as non-purgeable organic carbon, NPOC) data and averages; (2) surface water and sediment total nitrogen data and averages; (3) surface water major cations and anions and averages; (4) sediment grain size data; (5) sediment iron (II) data and averages; (6) wet sediment mass, dry sediment mass, water mass, and wet sediment volume in incubation and sediment ICR vials; (7) sediment incubation respiration rate data and averages; (8) normalized respiration rate data and averages; (9) methods codes; (10) sediment specific surface area; (11) sediment percent carbon and nitrogen; (12) sediment gravimetric moisture and averages; (15) sediment X-ray diffraction (XRD) data; (16) sediment adenosine triphosphate (ATP) and averages; (17) a subfolder with sediment incubation respiration data, scripts, and plots; (18) surface water and sediment FTICR methods; and (19) a subfolder of 9.4 Tesla (9.4T) FTICR-MS data.
The 9.4T FTICR-MS subfolder contains five subfolders: one containing the sediment .xml data files, one containing the water .xml files, one containing the sediment CoreMS output files, one containing the water CoreMS output files, and one containing instructions and scripts for processing the files in CoreMS (https://github.com/EMSL-Computing/CoreMS). The sensor data subfolder contains (1) a subfolder with miniDOT dissolved oxygen and temperature data and plots; (2) miniDOT dissolved oxygen and temperature summary data; and (3) miniDOT installation methods. All files are .csv, .pdf, .R, .xml, .d, .html, .Rmd, .py, .cal, .json, .jpg, .jpeg, .png, .mov, or .mp4.
CORRECTION: Carbon and nitrogen content are reported as percentages. The current column headers "01395_C_percent_per_mg" and "01397_N_percent_per_mg" are incorrect. These should read "01395_C_percent" and "01397_N_percent" and will be corrected in the next version of this data package.
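Until the corrected version is published, the header fix described in the correction note can be applied when the file is read. The following is a minimal sketch using only the Python standard library. It assumes the column names sit in the first row of the .csv file, which may not hold for every file in the package. The "Sample_Name" column and the values in the excerpt are hypothetical placeholders, not real data.

```python
import csv
import io

# Header corrections taken from the CORRECTION note.
HEADER_FIXES = {
    "01395_C_percent_per_mg": "01395_C_percent",
    "01397_N_percent_per_mg": "01397_N_percent",
}


def fix_headers(rows):
    """Return the rows with the first (header) row renamed per HEADER_FIXES.

    Assumption: the header is the first row. Skip any leading metadata
    rows before calling this function if the real file has them.
    """
    rows = list(rows)
    if not rows:
        return rows
    header = [HEADER_FIXES.get(name, name) for name in rows[0]]
    return [header] + rows[1:]


# Hypothetical excerpt: "Sample_Name" and the values are placeholders.
sample = io.StringIO(
    "Sample_Name,01395_C_percent_per_mg,01397_N_percent_per_mg\n"
    "EXAMPLE_001,1.0,0.1\n"
)
fixed = fix_headers(csv.reader(sample))
print(fixed[0])
```

To apply the same fix to a downloaded file, pass `csv.reader(open(path, newline=""))` in place of the in-memory excerpt and write the result back out with `csv.writer`.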
We thank the United States Forest Service, Washington Department of Fish and Wildlife, Washington Department of Natural Resources, Cowiche Canyon Conservatory, Washington State Parks and Recreation Commission (Scientific Research Permit #210901), and the Confederated Tribes and Bands of the Yakama Nation for access to field locations where the samples labeled “SSS” were collected. We also thank the Yakama Nation Tribal Council and Yakama Nation Fisheries for working with us to facilitate sample collection and optimization of data usage according to their values and worldview.
WHONDRS consortium members were asked to provide any acknowledgments for the collection of samples labeled “CM” and the following is a list of acknowledgments that were submitted with their corresponding Site IDs: (MART) Research activities were conducted in part on the Wind River Experimental Forest within the Gifford Pinchot National Forest; (MP-100379) Philadelphia is part of Lenapehoking, the ancestral homelands of the Lenape peoples; (MP-102398) Land surveyed is the ancestral homelands of the Nookhose'iinenno (Arapaho), Tsis tsis'tas (Cheyenne), and Nuuchu (Ute); (MP-100749 and MP-100747) Georgia Coastal Ecosystem LTER, OCE-1832178; (SP-70 and SP-72) Eastern Shoshone, Shoshone-Bannock; (MP-102944) Funded by Oregon Watershed Enhancement Board. On the traditional lands of the Confederated Tribes of the Siletz, Confederated Tribes of the Grand Ronde, and the Clatsop-Nehalem Confederated Tribe; (MP-100607) Holiday Creek is located on the traditional territory of the Monacan Indian Nation; (SP-45) Lafayette Blue Springs State Park; (MP-102420) NSF DEB-2016749; (MP-100019) New Hampshire Agriculture Experiment Station; (SP-35) Rayonier (land owner; https://www.rayonier.com/); (MP-101276) US Department of Energy, Office of Science, Biological and Environmental Research, Subsurface Biogeochemical Research, Watershed Dynamics and Evolution SFA at ORNL; (MP-103224) Watershed Dynamics and Evolution SFA at ORNL; (MP-101584) Traditional lands of the Oceti Sakowin (Dakota, Lakota, Nakoda) and Anishinaabe Peoples.
Provider:
River Corridor and Watershed Biogeochemistry SFA
Created:
2023-02-09



