Datasets of estimating spatiotemporally continuous snow water equivalent from intermittent satellite track observations using machine learning methods

Name: Datasets of estimating spatiotemporally continuous snow water equivalent from intermittent satellite track observations using machine learning methods
Creator: Li, Dongyue; Lettenmaier, Dennis P.; Margulis, Steven A.; Ma, Xiaoyu; Fang, Yiwen
Published: 2022-06-09 00:00:00
License: No description available

Figshare | Updated 2022-06-09 | Indexed 2026-04-08

Download link:

https://figshare.com/articles/dataset/Datasets_of_estimating_spatiotemporally_continuous_snow_water_equivalent_from_intermittent_satellite_track_observations_using_machine_learning_methods/20044424/1

Resource description:

1. Topography and vegetation cover data in the Upper Tuolumne Watershed
dataset: static_variable.mat
description: The structure array has four fields: elevation, aspect, slope, and fractional vegetation cover. The data have a spatial resolution of 16 arc-seconds.
Dimension: 5786*1
Dimension 1: the number of pixels in the Upper Tuolumne Watershed.
Used to plot Figure 1 and Figure S1 in the paper "Estimating spatiotemporally continuous snow water equivalent from intermittent satellite track observations using machine learning methods".

2. Snow reanalysis data
dataset: snow_reanalysis.mat
description: Snow water equivalent (SWE) data from the posterior snow reanalysis dataset from WY1985 to WY2022, at a daily time step with a spatial resolution of 16 arc-seconds. The original dataset covers the whole Western U.S.; here the data are extracted for the Upper Tuolumne River Basin, California. Developed by Fang et al. (2022).

3. Domain-wide SWE estimates
dataset: estimation_12yrs.mat
description: Track-to-Area (TTA) SWE data transformation using one statistical and three machine learning methods (four TTA methods in total) in the four driest years, four normal years, and four wettest years from WY2000 to WY2019. The dataset contains the domain-wide April 1st SWE estimates from the four alternative methods in these 12 years of differing climate conditions.
Dimension: 5786*12*4
Dimension 1: the number of pixels in the Upper Tuolumne Watershed.
Dimension 2: 12 years.
Dimension 3: four TTA methods.

4. Daily time series of domain-wide SWE estimates
dataset: dnn_prediction_*days_gap.mat
description: Daily time series of domain-wide SWE estimates from the deep neural network (DNN) method for a dry year (WY2015), a normal year (WY2008), and a wet year (WY2017), assuming the interval between two satellite overpasses is 1, 5, 10, 15, 20, or 30 days.
Dimension: 3*5786*366
Dimension 1: three water years.
Dimension 2: the number of pixels in the study area.
Dimension 3: days in a water year, counted from October 1st. If the water year does not contain a leap day, the values on day 366 are NaN.
Used to plot Figure 6 and Figure 7.

5. Feature sensitivity test
dataset: relative_importance_missing_feature.mat; MAE_feature_uncertainty_*.mat
description:
(1) Missing feature analysis (Figure 8). Dimension: 7*3; 7 meteorological variables (precipitation, air pressure, net longwave radiation, net shortwave radiation, air temperature, specific humidity, wind speed).
(2) Feature uncertainty analysis for the meteorological forcings (Figure 9). Dimension: 101*7; biases from -50% to 50% at 1% intervals (101 levels), for a dry year (WY2015), a normal year (WY2008), and a wet year (WY2017).

6. Sensitivity to the number of ground tracks
dataset: MAE_sensitivity_tracks.mat
description: The accuracy of domain-wide SWE estimates is expected to increase with more satellite overpasses in the study area, but satellite costs may also increase as ground tracks are added. A sensitivity test (estimation accuracy versus number of tracks) was carried out to explore the preferred number of ground tracks in the Upper Tuolumne Watershed. This dataset holds the results of that test, with the number of ground tracks varied from 1 to 6 in a dry year (WY2015), a normal year (WY2008), and a wet year (WY2017), for the four track-to-area methods.
Dimension: 3*6*4
Dimension 1: three years (WY2015, WY2008, and WY2017).
Dimension 2: number of ground tracks (1 to 6).
Dimension 3: four track-to-area (TTA) methods (MVLR, RF, SVM, and DNN, in that order).
Used to plot Figure 11 in the main text.
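The day dimension of the daily time series (366 entries counted from October 1st, with NaN on day 366 when the water year has no leap day) and the 101 bias levels can be illustrated with a small Python sketch. It does not read the .mat files; it only reproduces the index arithmetic implied by the description. The helper names (`water_year_start`, `water_year_length`, `day_index`) are hypothetical, and the sketch assumes the common convention that water year N runs from October 1 of calendar year N-1 through September 30 of year N.

```python
from datetime import date

# Method order along the last dimension, as listed in item 6 of the description.
TTA_METHODS = ("MVLR", "RF", "SVM", "DNN")

# Bias levels of the feature uncertainty analysis: -50% to 50% in 1% steps.
BIAS_PERCENT = list(range(-50, 51))


def water_year_start(water_year):
    """First day of the water year (assumption: Oct 1 of the previous calendar year)."""
    return date(water_year - 1, 10, 1)


def water_year_length(water_year):
    """Number of days in the water year: 366 if it contains Feb 29, else 365."""
    end = date(water_year, 9, 30)
    return (end - water_year_start(water_year)).days + 1


def day_index(water_year, when):
    """0-based position of a calendar date along the 366-long day dimension."""
    idx = (when - water_year_start(water_year)).days
    if not 0 <= idx < water_year_length(water_year):
        raise ValueError(f"{when} is not in WY{water_year}")
    return idx


# Length of each study year and the 0-based index of April 1st.
for wy in (2015, 2008, 2017):
    print(wy, water_year_length(wy), day_index(wy, date(wy, 4, 1)))
```

Under this convention, WY2008 is the only one of the three study years with 366 valid days, so day 366 would be NaN for WY2015 and WY2017. The indices are 0-based; add 1 when indexing in MATLAB.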

Provider:

Li, Dongyue; Lettenmaier, Dennis P.; Margulis, Steven A.; Ma, Xiaoyu; Fang, Yiwen

Created:

2022-06-09