
Monthly surface salinity maps of the Mekong Delta (2015--2025) at 30 m resolution

Source: DataCite Commons. Updated 2026-05-03; indexed 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.20003854
Resource summary:
Description

This dataset contains monthly surface salinity maps (in practical salinity units, PSU) at 30 m spatial resolution for the Mekong Delta estuary, spanning 2015--2025. The maps are produced from a physics-aware Random Forest model trained on Harmonized Landsat Sentinel-2 (HLS v2) optical reflectance, Sentinel-1 SAR backscatter, hydrological forcing (discharge and tidal statistics), topographic covariates, and discharge--position interaction terms. A square-root target transform with CROCO hydrodynamic model boundary regularization is used to improve spatial transferability. A SAR+forcing+position fallback model fills cloud-masked optical gaps, ensuring spatially complete monthly coverage.

The mapped domain covers the tide-influenced distributary network of the Vietnamese Mekong Delta, including the Tien, Hau, Co Chien, Ham Luong, and Cua Dai branches. Dry-season months (November--June) are the primary validated domain; wet-season months (July--October) are extrapolative technical extensions.
Data sources

The publicly available source data used in this study are listed below:

- HLS v2 surface reflectance (Landsat + Sentinel-2), NASA/USGS: [https://lpdaac.usgs.gov/products/hlsv202/](https://lpdaac.usgs.gov/products/hlsv202/)
- Sentinel-1 GRD IW backscatter, ESA/Copernicus: [https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S1_GRD](https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S1_GRD)
- JRC Global Surface Water, EC JRC: [https://global-surface-water.appspot.com/](https://global-surface-water.appspot.com/)
- MERIT Hydro, Yamazaki et al.: [http://hydro.iis.u-tokyo.ac.jp/~yamadai/MERIT_Hydro/](http://hydro.iis.u-tokyo.ac.jp/~yamadai/MERIT_Hydro/)
- DeltaDTM, Pronk et al.: [https://doi.org/10.1038/s41597-024-03135-8](https://doi.org/10.1038/s41597-024-03135-8)
- FES2022 tidal model, LEGOS/Noveltis: [https://www.aviso.altimetry.fr/en/data/products/auxiliary-products/global-tide-fes.html](https://www.aviso.altimetry.fr/en/data/products/auxiliary-products/global-tide-fes.html)
- CROCO hydrodynamic model, IRD/MIOS; forcing fields from Marchesiello et al. (2019)
- Mekong discharge gauges, MRC/National records at Kratie, Tan Chau, Chau Doc

File inventory

1) Yearly salinity maps (GeoTIFF)

Eleven files (one per year, 2015--2025), each containing 12 monthly bands:

`mekong_salinity_12months_waterbody_YYYY.tif`

| Property | Value |
| --- | --- |
| **Format** | GeoTIFF (LZW-compressed, 32-bit float) |
| **Coordinate system** | EPSG:4326 (WGS 84) |
| **Spatial resolution** | 0.00027 degrees (~30 m) |
| **Bands** | 12 (January--December, band index = calendar month) |
| **NoData** | NaN |
| **Pixel values** | Salinity in PSU, range 0--35 |

Bands 1--6 (Jan--Jun) and 11--12 (Nov--Dec) correspond to the dry season, the primary validated domain. Bands 7--10 (Jul--Oct) are wet-season extrapolations.
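As a minimal sketch of the file-naming and band conventions described above, the helpers below build a yearly file name, map a calendar month to its band index, and report whether that month falls in the validated dry-season domain. The function names are hypothetical and are not part of the dataset's own scripts; `read_month` additionally assumes that the `rasterio` package is installed.

```python
# Hypothetical helpers (not shipped with the dataset) illustrating the
# naming and band conventions stated in the file inventory.

# Dry-season bands per the inventory: 1-6 (Jan-Jun) and 11-12 (Nov-Dec).
DRY_SEASON_MONTHS = frozenset({1, 2, 3, 4, 5, 6, 11, 12})


def yearly_map_filename(year: int) -> str:
    """Return the GeoTIFF name for one year in the 2015-2025 range."""
    if not 2015 <= year <= 2025:
        raise ValueError(f"year {year} is outside the 2015-2025 coverage")
    return f"mekong_salinity_12months_waterbody_{year}.tif"


def band_for_month(month: int) -> int:
    """Band index equals calendar month (1 = January ... 12 = December)."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be 1-12, got {month}")
    return month


def is_validated_month(month: int) -> bool:
    """True for dry-season months, the primary validated domain."""
    return band_for_month(month) in DRY_SEASON_MONTHS


def read_month(path: str, month: int):
    """Read one monthly band as a 2-D array (NaN marks NoData)."""
    import rasterio  # imported lazily; assumed to be installed

    with rasterio.open(path) as src:
        return src.read(band_for_month(month))
```

For example, `read_month(yearly_map_filename(2016), 3)` would return the March 2016 salinity grid, and `is_validated_month(8)` flags August as a wet-season extrapolation.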
2) Data-quality flags (GeoTIFF)

Eleven companion files providing per-pixel prediction flags:

`mekong_salinity_12months_waterbody_YYYY_flags.tif`

| Flag value | Meaning |
| --- | --- |
| 0 | Full model prediction (optical + SAR + forcing available) |
| 1 | Fallback model prediction (optical predictors unavailable; SAR + forcing + position only) |
| 2 | No prediction (all predictors missing or masked) |

3) Per-year metadata (JSON)

Eleven metadata files containing per-year processing provenance:

`mekong_salinity_12months_waterbody_YYYY.metadata.json`

Fields include: year, raster dimensions, number of valid predicted pixels, finite-pixel fraction, source tile identifier, and cropping bounds.

4) Manifest file

Consolidated metadata across all processed years:

`mekong_salinity_12months_waterbody_manifest.json`

5) Station metadata

Field observation station locations used for model training and validation:

`Mekong_Station_Metadata.csv`

| Column | Description |
| --- | --- |
| `station_id` | Station name |
| `river_branch` | Distributary branch |
| `longitude_degE` | Longitude (EPSG:4326) |
| `latitude_degN` | Latitude (EPSG:4326) |
| `notes` | Additional information |

6) Saved trained model

A serialized Random Forest regressor trained on the complete 2015--2025 dataset:

`mekong_salinity_rf_v1.joblib`
`mekong_salinity_rf_v1.json`

| Property | Value |
| --- | --- |
| **Format** | joblib (Python 3, scikit-learn) |
| **Size** | ~4.6 MB |
| **Target transform** | sqrt(salinity) |

The saved model can be loaded with `joblib.load()` and used to predict salinity for new years from satellite features and hydrological forcing, without requiring access to the raw in-situ observation data. See `scripts/predict_new_data.py` for usage examples.
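The bundled `scripts/predict_new_data.py` remains the authoritative usage example. The sketch below only illustrates what the sqrt(salinity) target transform implies for anyone calling the model directly: predictions made in square-root space have to be squared to recover PSU. It assumes that the serialized regressor returns square-root-space values (if the saved object already inverts the transform, the squaring step should be skipped) and that the feature matrix follows the column order expected by the model, which is not documented here. The function names and the clipping to the stated 0--35 PSU range are illustrative choices, not the authors' code.

```python
import math
from typing import Iterable, List


def back_transform(sqrt_predictions: Iterable[float],
                   max_psu: float = 35.0) -> List[float]:
    """Square sqrt-space predictions and clip them to [0, max_psu] PSU.

    Assumption: inputs are sqrt(salinity) values; NaN is passed through.
    """
    salinity = []
    for value in sqrt_predictions:
        if math.isnan(value):
            salinity.append(float("nan"))
            continue
        psu = max(float(value), 0.0) ** 2  # invert the square-root transform
        salinity.append(min(psu, max_psu))  # illustrative clip to 0-35 PSU
    return salinity


def predict_salinity(model_path: str, features) -> List[float]:
    """Load the saved regressor and return salinity in PSU.

    `features` is a 2-D array-like whose column order must match the
    training features (an assumption; consult the model's JSON sidecar).
    """
    import joblib  # imported lazily; assumed to be installed

    model = joblib.load(model_path)
    return back_transform(model.predict(features))
```

A call such as `predict_salinity("mekong_salinity_rf_v1.joblib", X)` would then return one PSU value per row of `X`.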
7) CROCO hydrodynamic cache

Pre-computed CROCO 3D surface salinity for 2016, used to generate pseudo-observation boundary anchors:

`croco_2016_surface_salt_cache.npz`

| Property | Value |
| --- | --- |
| **Format** | NumPy .npz archive |
| **Size** | ~0.1 MB |
| **Content** | Gridded surface salinity for March--September 2016 |

8) Processing scripts

The `scripts/` folder contains the complete Python pipeline for reproducing the analysis or adapting it to other estuarine systems:

| Script | Description |
| --- | --- |
| `gee_batched_monthly_pipeline.py` | Main GEE pooled training, LOSO, and map export launcher |
| `gee_export_all_12_months.py` | Full 12-month yearly map export launcher |
| `gee_fast_monthly_loso.py` | Feature builder and station sampling logic |
| `gee_forcing_lookup.py` | Monthly discharge and tidal forcing lookup |
| `gee_config.py` | Central configuration (study area, model parameters, asset paths) |
| `local_add_forcing_features.py` | Merge forcing values into matched GEE training tables |
| `local_validate_with_forcing.py` | LOSO/LOYO/temporal validation, bootstrap CIs, SHAP, permutation importance |
| `local_prepare_zenodo_clean_exports.py` | Merge, clip, compress, and package yearly map exports |
| `local_prepare_waterbody_mask.py` | Prepare waterbody mask for GEE and local packaging |
| `local_station_waterbody_qc.py` | Audit station locations against waterbody geometry |
| `_generate_croco_pseudo_obs.py` | Generate CROCO temporal, ocean, freshwater, and estuarine boundary anchors |
| `train_final_model.py` | Train and serialize the production Random Forest model |
| `predict_new_data.py` | Predict from new feature data (CSV or raster) using a saved model |

All scripts require a Google Earth Engine account and the `earthengine` Python API. The GEE project ID in `gee_config.py` should be replaced with the user's own project ID when adapting the workflow.

License

This dataset is distributed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. Please cite this dataset in any publications or derivative works.
Contact

For questions, please contact the corresponding author listed in the associated manuscript.
Provider:
Zenodo
Created:
2026-05-03