GlacioVision Dataset
Source: DataCite Commons. Updated 2026-05-05; indexed 2026-05-07.
Download link:
https://zenodo.org/doi/10.5281/zenodo.19305311
Resource description:
This dataset is a multi-modal collection of glacier data for three glaciers in Pakistan (Siachen, Baltoro, and Chiantar) covering 2019-2024. It is designed to support research in glaciology, remote sensing, climate science, and machine learning, particularly glacier segmentation, elevation modeling, and multi-modal data fusion.
The dataset integrates three primary data sources:
Satellite imagery from Google Earth Engine (Sentinel-1 and Sentinel-2)
Climate reanalysis data from Copernicus Atmosphere Data Store (CAMS EAC4)
Elevation measurements from NASA Earthdata (GEDI L2A)
These data sources are processed into spatially aligned formats and organized into three complementary dataset representations: raw data, patch-based data, and inference-ready data. Additionally, the full data processing pipeline is provided to ensure reproducibility.
Dataset Contents
The dataset is organized into the following main components:
1. Patch-Based Dataset (Primary)
A machine learning–ready dataset consisting of spatial patches extracted from multi-modal raster data.
Patch size: 256 × 256 pixels
Overlap: 50%
Inputs:
Multi-band satellite and elevation imagery (10 bands: six Sentinel-2 bands, NDSI, two Sentinel-1 bands, and one elevation band)
Climate data (year-wise)
Outputs:
Glacier segmentation masks
Elevation patches
This format is optimized for deep learning applications.
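The patch geometry above (256 × 256 pixels, 50% overlap) implies a stride of 128 pixels between neighboring patches. The following is a minimal sketch of such a tiling, assuming the raster is held as a NumPy array of shape (bands, height, width). The function name `extract_patches` and the choice to skip incomplete edge patches are assumptions for illustration and are not taken from the dataset's own pipeline.

```python
import numpy as np

def extract_patches(stack, patch_size=256, overlap=0.5):
    """Yield (row, col, patch) tuples from a (bands, height, width) array.

    Edge regions that cannot fill a whole patch are skipped (an assumption).
    """
    stride = int(patch_size * (1 - overlap))  # 128 for 256 px at 50% overlap
    _, height, width = stack.shape
    for row in range(0, height - patch_size + 1, stride):
        for col in range(0, width - patch_size + 1, stride):
            yield row, col, stack[:, row:row + patch_size, col:col + patch_size]

# Synthetic 10-band raster of 512 x 640 pixels, standing in for a real stack
demo = np.zeros((10, 512, 640), dtype=np.float32)
patches = list(extract_patches(demo))
print(len(patches), patches[0][2].shape)
```

Because neighboring patches share half of their pixels, a train/validation split made at patch level may leak information between the two sets; splitting by glacier or by region avoids this.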
2. Raw Dataset
Full-resolution, spatially continuous data before patch extraction.
Multi-band satellite + Elevation raster
Glacier masks
Elevation rasters
Climate CSV files
This representation preserves original spatial structure and is suitable for geospatial analysis.
3. Inference Dataset
Structured dataset for model evaluation and real-world deployment scenarios.
4. Processing Pipeline
All scripts used for:
Climate data extraction and preprocessing
Satellite data acquisition and compositing
Elevation extraction and interpolation
Raster stacking and alignment
Patch generation
Data Characteristics
Temporal Coverage
2019-2024
Spatial Coverage
Three glacier regions (Siachen, Baltoro, Chiantar)
Spatial Resolution
25 meters (satellite and stacked data)
Coordinate Reference System
EPSG:32643
Data Sources and Methodology
Climate Data
Climate variables are derived from CAMS global reanalysis (EAC4). Data is extracted for each glacier region using bounding box coordinates and processed into yearly inputs using a hydrological year window:
October (previous year) to September (current year)
Variables include temperature, pressure, humidity, snow properties, and atmospheric parameters.
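The hydrological year window can be read as a mapping from a calendar date to a year label: October through December count toward the following year. The sketch below illustrates that mapping and a per-year aggregation using only the Python standard library. The function names and the use of a simple mean are assumptions; the dataset description does not state how values are aggregated within a year.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

def hydrological_year(d: date) -> int:
    """Label a date with the hydrological year (October to September) it falls in."""
    return d.year + 1 if d.month >= 10 else d.year

def yearly_means(records):
    """Average (date, value) records per hydrological year (mean is an assumption)."""
    groups = defaultdict(list)
    for d, value in records:
        groups[hydrological_year(d)].append(value)
    return {year: mean(values) for year, values in sorted(groups.items())}

# Illustrative monthly temperatures in degrees Celsius, not real CAMS values
records = [
    (date(2019, 10, 1), -8.0),
    (date(2020, 1, 1), -15.0),
    (date(2020, 9, 1), 2.0),
    (date(2020, 10, 1), -7.0),
]
result = yearly_means(records)
print(result)
```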
Satellite Data
Satellite imagery is obtained via Google Earth Engine:
Sentinel-2: Optical bands (B2, B3, B4, B8, B11, B12) with NDSI
Sentinel-1: SAR bands (VV, VH)
Scenes are filtered to less than 30% cloud cover and aggregated using median compositing over June–September.
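For reference, NDSI is conventionally computed from the green and shortwave-infrared bands, which for Sentinel-2 are B3 and B11: NDSI = (B3 − B11) / (B3 + B11). The sketch below shows that index and a per-pixel median composite in NumPy rather than in Google Earth Engine. The function names are hypothetical, and treating cloud-masked pixels as NaN is an assumption, since the description does not say how masked pixels are handled.

```python
import numpy as np

def ndsi(green, swir1):
    """Normalized Difference Snow Index; NaN where the denominator is zero."""
    green = np.asarray(green, dtype=np.float64)
    swir1 = np.asarray(swir1, dtype=np.float64)
    denom = green + swir1
    out = np.full(denom.shape, np.nan)
    np.divide(green - swir1, denom, out=out, where=denom != 0)
    return out

def median_composite(scenes):
    """Per-pixel median over a (time, height, width) stack, ignoring NaN pixels."""
    return np.nanmedian(np.asarray(scenes, dtype=np.float64), axis=0)

# Two pixels: one snow-like, one with no signal in either band
index = ndsi(np.array([0.75, 0.0]), np.array([0.25, 0.0]))

# Three scenes of a 1 x 2 image; NaN marks a pixel assumed to be cloud-masked
scenes = [[[0.25, np.nan]], [[0.5, 0.5]], [[1.0, 1.0]]]
composite = median_composite(scenes)
print(index, composite)
```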
Elevation Data
Elevation data is derived from GEDI L2A measurements:
Extracted from HDF5 files
Filtered using quality and degradation flags
Interpolated into continuous elevation maps using inverse distance weighting (IDW)
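Inverse distance weighting estimates each grid cell as a weighted mean of the surrounding GEDI footprints, with weights proportional to 1 / distance^p. The following is a minimal sketch of the technique in NumPy. The power p = 2, the use of all points instead of a search radius, and the function name `idw_interpolate` are assumptions; the dataset description does not give these settings.

```python
import numpy as np

def idw_interpolate(xy_known, values, xy_query, power=2.0, eps=1e-12):
    """Inverse distance weighting: weighted mean with weights 1 / distance**power."""
    xy_known = np.asarray(xy_known, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    xy_query = np.asarray(xy_query, dtype=np.float64)
    # Pairwise distances between query and known points, shape (n_query, n_known)
    dist = np.linalg.norm(xy_query[:, None, :] - xy_known[None, :, :], axis=2)
    # Clamp tiny distances so a query on top of a sample does not divide by zero
    weights = 1.0 / np.maximum(dist, eps) ** power
    return (weights @ values) / weights.sum(axis=1)

# Two illustrative elevation samples (metres) and two query locations
known_xy = [(0.0, 0.0), (2.0, 0.0)]
known_z = [4000.0, 5000.0]
estimates = idw_interpolate(known_xy, known_z, [(1.0, 0.0), (0.0, 0.0)])
print(estimates)
```

The midpoint query receives the average of the two samples, and the query located on a sample reproduces that sample's value. This local averaging is also the source of the smoothing artifacts that interpolated elevation surfaces can show.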
Data Integration
All data sources are:
Reprojected to a common CRS (EPSG:32643)
Resampled to consistent resolution
Spatially aligned and stacked into multi-band rasters
Final stacked raster includes:
Sentinel-2 bands
NDSI
Sentinel-1 bands
Elevation band
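The band list above adds up to ten layers: six Sentinel-2 bands, NDSI, two Sentinel-1 bands, and one elevation band. A minimal sketch of stacking already aligned layers follows. The band order in `BAND_NAMES` mirrors the order of the list above but is an assumption about the files themselves, and `stack_bands` is a hypothetical helper, not part of the published pipeline.

```python
import numpy as np

# Assumed band order, following the listing in the dataset description
BAND_NAMES = ["B2", "B3", "B4", "B8", "B11", "B12", "NDSI", "VV", "VH", "Elevation"]

def stack_bands(layers):
    """Stack aligned 2-D arrays keyed by band name into (bands, height, width)."""
    missing = [name for name in BAND_NAMES if name not in layers]
    if missing:
        raise KeyError(f"missing bands: {missing}")
    shapes = {layers[name].shape for name in BAND_NAMES}
    if len(shapes) != 1:
        raise ValueError(f"layers are not aligned: {shapes}")
    return np.stack([layers[name] for name in BAND_NAMES], axis=0)

# Synthetic 4 x 5 layers, each filled with its own band index
layers = {
    name: np.full((4, 5), i, dtype=np.float32)
    for i, name in enumerate(BAND_NAMES)
}
stacked = stack_bands(layers)
print(stacked.shape)
```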
Use Cases
This dataset supports:
Glacier segmentation (semantic segmentation)
Elevation prediction and reconstruction
Multi-modal deep learning
Climate-glacier interaction studies
Temporal analysis of glacier dynamics
File Formats
GeoTIFF (.tif): Satellite imagery, Elevation maps, masks
CSV (.csv): Climate data, extracted elevation points
Python scripts (.py): Processing pipeline
Keywords
Glacier Monitoring, Remote Sensing, Sentinel-1, Sentinel-2, Elevation maps, Climate Data, Multi-Modal Dataset, Deep Learning, Geospatial Data, Time Series
Notes
Climate data is shared per year and reused across patches
Elevation data is interpolated and may include smoothing artifacts
Satellite data is limited to summer months (June-September)
Dataset includes overlapping patches for improved model training
Acknowledgments
We acknowledge:
Copernicus Atmosphere Data Store (CAMS)
Google Earth Engine
NASA GEDI mission
Open-source geospatial libraries (Rasterio, Pandas, QGIS)
Provider:
Zenodo
Created:
2026-03-29



