Data-driven Reservoir Operation Rules for 450+ Reservoirs in Contiguous United States
Source: DataCite Commons | Updated: 2025-12-12 | Indexed: 2026-04-25
Download link:
http://www.hydroshare.org/resource/63add4d5826a4b21a6546c571bdece10
Description:
The extensive construction of dams imposes significant human perturbation on river systems and substantially alters surface water hydrology. However, reservoir operation has long been simplified or ignored in large-scale hydrological and water resources simulation, partly because operation manuals are inaccessible for most reservoirs. This dataset provides empirical operation rules documented and discussed in Li et al. (https://doi.org/10.1029/2023WR036686), covering 450+ large reservoirs in the Conterminous United States (CONUS). The rules are derived from daily inflow and storage records using the machine learning-based generic data-driven operation model (GDROM, Chen et al. 2022, https://doi.org/10.1016/j.advwatres.2022.104274). Reservoirs operated mainly for flood control make up the largest share (43%) and are located primarily in the Eastern and Central United States. Irrigation reservoirs follow (23%), mostly in the Western United States. The dataset also includes hydropower reservoirs (17%), located primarily in the Southeastern United States and the Pacific Northwest, as well as water supply reservoirs (9%), recreation reservoirs (5%), and navigation reservoirs (3%) spread across the CONUS regions. Most records span 15+ years, and most of these are long enough to capture inter-annual operation patterns and long-term changes.
The dataset contains 1) the daily operation records from multiple data sources used for model training and validation, and 2) the derived operation rules, expressed as "if-then" rules, for each of the 450+ reservoirs. The raw data were processed for training the GDROM by a) computing a "net inflow" that replaces the observed inflow and accounts for storage change due to precipitation, evaporation, seepage, and interaction with groundwater (discharge and recharge); b) detecting and removing dates with missing data to obtain continuous time series; and c) correcting outliers (e.g., records with abnormal sudden storage changes). In addition, for each reservoir, the inflow, storage, and release are normalized by the maximum historical storage during the observation period, which makes the extracted operation modules comparable among reservoirs of different sizes. The normalization also reduces the time required for hyperparameter tuning, especially for the minimum impurity decrease, whose range of candidate values is considerably narrowed. The operation rules for each reservoir contain one or more representative operation modules and the hydroclimatic conditions under which each module is applied. Both the modules and the module application conditions are derived from the Decision Tree; the data-driven model composed of the modules and their application conditions is provided as "if-then" statements.
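The preprocessing steps described above can be illustrated with a minimal sketch. It is not the authors' code. It assumes that net inflow is back-calculated from a daily water balance (storage change plus release), that storage and release share the same volume-per-day units, and that a day is dropped whenever a value needed for that balance is missing. The function name `preprocess` and the sample numbers are hypothetical, and outlier correction (step c) is omitted.

```python
from typing import Dict, List, Optional


def preprocess(
    storage: List[Optional[float]],
    release: List[Optional[float]],
) -> List[Dict[str, float]]:
    """Return normalized (net_inflow, storage, release) rows for usable days.

    Assumption (not stated in the dataset description): net inflow on day t
    is back-calculated as S[t+1] - S[t] + R[t], with storage and release in
    the same volume units per daily step.
    """
    # Normalize by the maximum observed storage, as the description states.
    observed = [s for s in storage if s is not None]
    max_storage = max(observed)

    rows = []
    for t in range(len(storage) - 1):
        s_now, s_next, r_now = storage[t], storage[t + 1], release[t]
        # Skip any day whose water balance cannot be closed (missing data).
        if s_now is None or s_next is None or r_now is None:
            continue
        net_inflow = (s_next - s_now) + r_now
        rows.append(
            {
                "net_inflow": net_inflow / max_storage,
                "storage": s_now / max_storage,
                "release": r_now / max_storage,
            }
        )
    return rows


# Hypothetical five-day record with one missing storage value.
storage = [100.0, 110.0, None, 120.0, 115.0]
release = [5.0, 4.0, 6.0, 10.0, 8.0]
rows = preprocess(storage, release)
```

Because every variable is divided by the same reservoir-specific maximum storage, the resulting values are dimensionless, which is what allows operation modules to be compared across reservoirs of different sizes.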
(Update - January 2025) The processed daily operation records for 256 selected reservoirs, each with a minimum of 25 years of data (spanning from 1990 to 2014 or later), are available in another HydroShare repository (Chen and Cai, 2025: http://www.hydroshare.org/resource/092720588e2e4524bf2674235ff69d81).
Provider:
Consortium of Universities for the Advancement of Hydrologic Science, Inc
Created:
2025-12-12



