A global high-resolution evapotranspiration dataset driven by multimodal machine learning (1950-2024)

National Tibetan Plateau Data Center. Updated 2025-09-15; indexed 2025-10-04.
Download link:
https://data.tpdc.ac.cn/zh-hans/data/e22aef74-4534-4624-8e2e-f4eab129675b
Resource description:
Evapotranspiration (ET), the second largest hydrological flux over terrestrial surfaces, plays a critical role in the coupled water, energy, and carbon cycles. However, existing ET products carry considerable uncertainty because of their coarse spatial resolution, limited temporal span, and reliance on simplified assumptions. This study developed a multimodal machine learning framework that generates a high-resolution (0.1°, daily) global ET dataset covering 1950–2024 by integrating 13 ET products derived from remote sensing, machine learning, land surface models, and reanalysis datasets, together with ET observations from 462 flux towers. The framework first uses the Light Gradient Boosting Machine (LightGBM) model to reconstruct each ET product to a consistent spatiotemporal resolution and temporal range, then applies Automated Machine Learning (AutoML) to fuse the products, using ERA5-Land atmospheric forcing data and auxiliary data as predictors. Validation results show that the fused product significantly outperforms existing datasets, achieving the highest accuracy across a range of ecosystems and regions (KGE = 0.857, RMSE = 0.726 mm/day). It effectively captures spatiotemporal variability and corrects the systematic underestimation bias common in other datasets, particularly in humid regions. The new dataset provides a more reliable tool for evaluating water, energy, and carbon cycles in regional hydrology and ecosystem studies, and the proposed integration framework offers a research paradigm for fusing datasets with distinct characteristics.
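
The two validation metrics quoted above, KGE and RMSE, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' evaluation code: it assumes the original Kling-Gupta efficiency formulation (correlation, variability ratio, and bias ratio), since the description does not say which KGE variant was used, and the function names and sample values are hypothetical.

```python
import math
from statistics import fmean, pstdev


def rmse(obs, sim):
    """Root-mean-square error, in the same units as the inputs (e.g. mm/day)."""
    return math.sqrt(fmean((s - o) ** 2 for o, s in zip(obs, sim)))


def kge(obs, sim):
    """Kling-Gupta efficiency (assumed original formulation); 1.0 is a perfect match."""
    mu_o, mu_s = fmean(obs), fmean(sim)
    sd_o, sd_s = pstdev(obs), pstdev(sim)
    # Pearson correlation between observed and simulated series
    cov = fmean((o - mu_o) * (s - mu_s) for o, s in zip(obs, sim))
    r = cov / (sd_o * sd_s)
    alpha = sd_s / sd_o  # variability ratio
    beta = mu_s / mu_o   # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical daily ET values in mm/day (illustrative only, not from the dataset)
tower_et = [1.0, 2.0, 3.0, 4.0]
product_et = [2.0, 3.0, 4.0, 5.0]

print(f"RMSE = {rmse(tower_et, product_et):.3f} mm/day")
print(f"KGE  = {kge(tower_et, product_et):.3f}")
```

In this toy case the product tracks the tower observations perfectly in timing and variability but is biased high by 1 mm/day, so the correlation and variability terms contribute nothing and the whole KGE penalty comes from the bias ratio.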
Provided by:
Wei Zhongwang (魏忠旺), Xu Qingchen (徐清晨), Li Lu (李璐)
Created:
2025-09-05
Collected summary
Dataset introduction
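Background and challenges
Background overview
This is a global high-resolution evapotranspiration dataset covering 1950-2024, with a spatial resolution of 0.1°-0.25° and a daily time step. By fusing multiple data sources in a multimodal machine learning framework, it markedly improves the accuracy of evapotranspiration estimates and their ability to capture spatiotemporal variability, making it suitable for regional hydrology and ecosystem research.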
The summary above was collected and generated by 遇见数据集.