DGGS Benchmark Replication Study - Results Dataset
Source: Zenodo. Updated 2026-03-07; indexed 2026-05-26.
Download link:
https://zenodo.org/doi/10.5281/zenodo.18904135
Results Dataset for the DGGS Benchmark Replication Study
Description
This dataset contains the results of a reproducible replication study of the benchmarks presented in:
Law, R.M. & Ardo, J. (2024). "Using a discrete global grid system for a scalable, interoperable, and reproducible system of land-use mapping." Big Earth Data, 9(1), 29-46. DOI: 10.1080/20964471.2024.2429847
The replication validates the paper's two central claims:
Vector benchmark: DGGS provides "orders of magnitude" performance improvement over traditional vector overlay operations
Raster benchmark: DGGS and raster methods show "roughly equivalent performance" for classification tasks
What's New in Version 3.0.0
Version 3.0.0 extends the replication to include HEALPix benchmarks using the healpix-geo library (v0.0.11), which supports both sphere and WGS84 ellipsoid reference surfaces. This is the first version to provide a cross-DGGS unified comparison: H3 vs HEALPix/sphere vs HEALPix/WGS84.
Key new finding: The choice of reference surface (sphere vs WGS84 ellipsoid) has a negligible effect on performance but a large effect on cell assignment. At mid-latitudes (+48°, e.g. Mediterranean) and high latitudes (+62°, e.g. Scandinavia), 98% and 91% of pixels, respectively, are assigned to different HEALPix cells depending on whether a sphere or the WGS84 ellipsoid is used. This is directly relevant to European EO data (Sentinel/Copernicus, 45–65°N) and is a strong argument for geodetically correct indexing in production workflows.
Files Included
Version 2.0.0 files (H3 / xdggs replication)
| File | Description |
| --- | --- |
| vector_benchmark.csv | Timing results for vector overlay vs H3 DGGS comparison |
| raster_benchmark.csv | Timing results for raster vs H3 DGGS comparison |
| indexing_benchmark.json | Comparison of H3 loop vs xdggs vectorized indexing |
| system_info.json | Hardware/software environment details for reproducibility |
| summary.json | Structured summary of results and validation status |
| benchmark_unified.png | Visualization of all benchmark results (PNG format) |
| benchmark_unified.pdf | Visualization of all benchmark results (PDF format) |
Version 3.0.0 files (HEALPix / healpix-geo extension)
| File | Description |
| --- | --- |
| vector_benchmark_healpix_geo.csv | Timing results for vector overlay vs HEALPix (sphere and WGS84) |
| raster_benchmark_healpix_geo.csv | Timing results for raster vs HEALPix (sphere and WGS84) |
| ellipsoid_analysis.json | Pixel-assignment difference between sphere and WGS84 by latitude band |
| comparison_table.csv | Unified cross-DGGS comparison table (H3, HEALPix/sphere, HEALPix/WGS84) |
| comparison_summary.json | Structured summary of cross-DGGS comparison results |
| comparison.png | Cross-DGGS comparison visualization (PNG format) |
| comparison.pdf | Cross-DGGS comparison visualization (PDF format) |
| run_healpix_geo_replication.py | Benchmark script for HEALPix/sphere and HEALPix/WGS84 |
| run_comparison.py | Cross-DGGS unified comparison script (reads all result CSVs) |
Benchmark Configuration
Vector Benchmark (Figure 6 replication)
Layers tested: 5, 10, 20, 50
H3 resolution: 14 (matching paper)
HEALPix depth: 9
Method: Voronoi polygons with random point distribution, dissolved by binary value, then overlaid
Raster Benchmark (Figure 7 replication)
Layers tested: 10; 50; 100; 500; 1,000; 5,000; 10,000
H3 / HEALPix resolution: 9
Raster size: 100 × 100 pixels per layer
Method: Neutral Landscape Model (mid-point displacement) with Gaussian smoothing
Classification Logic
Following the paper's methodology, classification uses seven mathematical functions applied to summed layer values:
Prime number test
Perfect number test
Triangular number test
Square number test
Pentagonal number test
Hexagonal number test
Fibonacci number test
The combination of these seven binary outputs produces up to 127 distinct classes.
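The seven tests and their combination into a class code can be sketched as below. This is a minimal illustration, not the paper's or the replication's code: the function names, the bit order, the restriction to integers n >= 1, and the assumption that the summed layer value is an integer are all choices made here for the example. Seven bits allow 2^7 = 128 patterns in principle; the sketch does not try to determine which of them are actually reachable.

```python
from math import isqrt


def _is_perfect_square(n: int) -> bool:
    """True if n is a non-negative perfect square."""
    return n >= 0 and isqrt(n) ** 2 == n


def is_prime(n: int) -> bool:
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    # Trial division by odd candidates up to sqrt(n).
    return all(n % d for d in range(3, isqrt(n) + 1, 2))


def is_perfect(n: int) -> bool:
    """True if n equals the sum of its proper divisors (6, 28, 496, ...)."""
    if n < 2:
        return False
    total = 1
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            total += d
            if d != n // d:
                total += n // d
    return total == n


def is_triangular(n: int) -> bool:
    return n >= 1 and _is_perfect_square(8 * n + 1)


def is_square(n: int) -> bool:
    return n >= 1 and _is_perfect_square(n)


def is_pentagonal(n: int) -> bool:
    # n = k(3k - 1)/2 has an integer solution k iff sqrt(24n + 1) % 6 == 5.
    m = 24 * n + 1
    return n >= 1 and _is_perfect_square(m) and isqrt(m) % 6 == 5


def is_hexagonal(n: int) -> bool:
    # n = k(2k - 1) has an integer solution k iff sqrt(8n + 1) % 4 == 3.
    m = 8 * n + 1
    return n >= 1 and _is_perfect_square(m) and isqrt(m) % 4 == 3


def is_fibonacci(n: int) -> bool:
    # n is a Fibonacci number iff 5n^2 + 4 or 5n^2 - 4 is a perfect square.
    return n >= 1 and (
        _is_perfect_square(5 * n * n + 4) or _is_perfect_square(5 * n * n - 4)
    )


# Bit order is an arbitrary choice for this sketch (bit 0 = prime, ...).
TESTS = (is_prime, is_perfect, is_triangular, is_square,
         is_pentagonal, is_hexagonal, is_fibonacci)


def classify(total: int) -> int:
    """Pack the seven boolean outcomes into one integer class code."""
    return sum(1 << bit for bit, test in enumerate(TESTS) if test(total))
```

With this bit order, a summed value of 6 is perfect, triangular, and hexagonal, so its class code has bits 1, 2, and 5 set.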
Key Results
Vector Benchmark (all methods validate the paper's claim)
| Layers | DGGS Time (s) | Vector Time (s) | DGGS Speedup |
| --- | --- | --- | --- |
| 5 | ~0.02 | ~0.4 | 22× |
| 10 | ~0.03 | ~2.5 | 105× |
| 20 | ~0.05 | ~27 | 541× |
| 50 | ~0.13 | ~780 | 5,999× |
Conclusion: DGGS is orders of magnitude faster than vector overlay. The speedup increases with layer count because vector overlay creates exponentially more sliver polygons, while DGGS cell count remains fixed.
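The scaling behaviour can be checked with a few lines of arithmetic on the rounded timings from the table. This is only a sketch: the timings are the approximate values shown above, so the recomputed ratios differ somewhat from the reported speedups, which were presumably derived from the unrounded measurements in vector_benchmark.csv.

```python
# Rounded timings from the vector benchmark table: layers -> (dggs_s, vector_s).
ROUNDED_TIMES = {
    5: (0.02, 0.4),
    10: (0.03, 2.5),
    20: (0.05, 27.0),
    50: (0.13, 780.0),
}


def speedups(times):
    """Vector time divided by DGGS time, per layer count."""
    return {layers: vec / dggs for layers, (dggs, vec) in times.items()}


def growth(times, first, last):
    """How much each method's runtime grows between two layer counts."""
    dggs_growth = times[last][0] / times[first][0]
    vector_growth = times[last][1] / times[first][1]
    return dggs_growth, vector_growth


if __name__ == "__main__":
    for layers, ratio in speedups(ROUNDED_TIMES).items():
        print(f"{layers:>3} layers: ~{ratio:,.0f}x")
    dggs_growth, vector_growth = growth(ROUNDED_TIMES, 5, 50)
    print(f"5 -> 50 layers: DGGS x{dggs_growth:.1f}, vector x{vector_growth:,.0f}")
```

Between 5 and 50 layers the DGGS time grows by less than an order of magnitude while the vector time grows by more than three, which is where the widening speedup comes from.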
Cross-DGGS Comparison (new in v3.0.0)
| Method | Max speedup vs vector | Crossover point |
| --- | --- | --- |
| H3 (sphere) | ~5,800× | ~5 layers |
| HEALPix / sphere | ~5,691× | ~5 layers |
| HEALPix / WGS84 | ~5,603× | ~5 layers |
All three implementations validate the paper's claim. The near-identical speedups and crossover points confirm that the performance advantage is a property of the DGGS paradigm itself (join-on-cell-ID), not of any specific implementation.
Sphere vs WGS84 Ellipsoid Indexing Difference (new in v3.0.0)
| Region | Center latitude | Pixels in different cell | Jaccard similarity |
| --- | --- | --- | --- |
| Equatorial | 0° | 27% | 0.9951 |
| Mid-latitude (Mediterranean) | +48° | 98% | 0.9843 |
| High-latitude (Scandinavia) | +62° | 91% | 0.9868 |
| Arctic | +78° | 53% | 0.9908 |
Conclusion: For European EO data (Copernicus/Sentinel, 45–65°N), sphere-based HEALPix indexing assigns 91–98% of pixels to a different cell than WGS84-based indexing does. WGS84 indexing via healpix-geo is strongly recommended for production workflows.
Raster Benchmark
DGGS pre-indexed vs Raster classification: Roughly equivalent performance (within 2–3×)
xdggs vectorized indexing: Significantly faster than H3 loop-based indexing for coordinate-to-cell conversion
Conclusion: For pre-indexed data, DGGS classification performance matches raster, validating the paper's claim.
Understanding the Raster Benchmark Plot
The raster benchmark plot shows four methods:
| Line | Method | What it measures |
| --- | --- | --- |
| 🟠 Raster (baseline) | NumPy array operations | Traditional raster stacking and classification |
| 🟦 DGGS+H3 (reproduction) | H3 loop indexing | Paper's original approach: index each layer with H3, then classify |
| 🟣 DGGS+xdggs (replication) | xdggs vectorized indexing | Alternative approach: index with xdggs, then classify |
| 🟢 DGGS pre-indexed | Read from Parquet | Paper's target scenario: data already indexed to DGGS |
Key interpretation:
The DGGS pre-indexed line (green) represents the paper's main use case: data is indexed to DGGS once, then queried many times
The Classification Only subplot (bottom-right) isolates this comparison, showing DGGS and raster are roughly equivalent
The gap between DGGS+H3 and DGGS+xdggs demonstrates the indexing speedup from vectorization
Methodology
What is a DGGS?
A Discrete Global Grid System (DGGS) is a spatial reference system that partitions the Earth's surface into a hierarchy of cells, nominally of equal area at each resolution (HEALPix cells are equal-area; H3 cells are only approximately so). Unlike traditional coordinate systems, a DGGS provides:
Fixed discretization: Space is divided into a finite number of cells at each resolution level
Hierarchical structure: Cells nest within parent cells, enabling multi-resolution analysis
Unique cell identifiers: Each cell has a unique ID that implicitly encodes its location
This study uses two DGGS: H3 (Uber's hexagonal hierarchical spatial index) and HEALPix (Hierarchical Equal Area isoLatitude Pixelization), which is widely used in astrophysics and increasingly adopted in Earth observation.
Key insight: When data is indexed to a DGGS, spatial joins become simple attribute joins on cell IDs, avoiding expensive geometric intersection computations.
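The join-on-cell-ID idea can be sketched with plain dictionaries. This is an illustration under stated assumptions, not the benchmark code (which, per this record, uses Polars): each layer is modelled as a mapping from a cell ID to a value, and the cell IDs shown are hypothetical placeholders rather than real H3 or HEALPix identifiers.

```python
def inner_join(left, right):
    """Join two {cell_id: value} layers on cell ID; no geometry is involved."""
    shared = left.keys() & right.keys()
    return {cell: (left[cell], right[cell]) for cell in shared}


def sum_layers_by_cell(layers):
    """Sum values per cell ID across any number of layers."""
    totals = {}
    for layer in layers:
        for cell_id, value in layer.items():
            totals[cell_id] = totals.get(cell_id, 0) + value
    return totals


if __name__ == "__main__":
    # Hypothetical cell IDs and binary layer values, for illustration only.
    layer_1 = {"cell_a": 1, "cell_b": 0, "cell_c": 1}
    layer_2 = {"cell_a": 1, "cell_b": 1, "cell_d": 0}
    print(inner_join(layer_1, layer_2))
    print(sum_layers_by_cell([layer_1, layer_2]))
```

Each additional layer adds one hash lookup per cell, whereas a vector overlay must intersect the new layer's polygons with every polygon produced so far.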
Reproduction vs Replication
Following established terminology in reproducibility research:
| Term | Definition | Implementation in this study |
| --- | --- | --- |
| Reproduction | Same methodology, same tools | H3 library + Polars (matching the paper's approach) |
| Replication | Same methodology, alternative tools | xdggs for vectorized H3 indexing (v2.0.0); healpix-geo for HEALPix sphere+WGS84 (v3.0.0) |
The Role of xdggs in Replication (v2.0.0)
xdggs is a Python library that provides Xarray extensions for DGGS operations. It offers an alternative implementation for converting geographic coordinates to DGGS cell IDs.
Performance comparison:
| Method | Time per layer | Relative speed |
| --- | --- | --- |
| H3 loop | ~0.15 s | 1× (baseline) |
| xdggs vectorized | ~0.001 s | ~150× faster |
The Role of healpix-geo in Replication (v3.0.0, new)
healpix-geo is a Python library (built on the cdshealpix Rust crate) that provides HEALPix indexing on both spherical and ellipsoidal (WGS84, GRS80) reference surfaces. Unlike cdshealpix or astropy, it requires no astronomy dependencies and natively supports geodetically-correct indexing.
This is the first replication study to test HEALPix with a proper WGS84 ellipsoid, motivated by the observation that Copernicus/Sentinel data is acquired predominantly over Europe (45–65°N) where the WGS84 flattening correction is largest.
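A rough sense of scale for the ellipsoidal correction can be obtained with the standard library alone. The sketch below is not healpix-geo's algorithm: it uses the difference between geodetic and geocentric latitude only as a proxy for the size of the correction (the source does not state which auxiliary latitude the library applies internally), converts it to a ground distance on a mean-radius sphere, and compares it with the approximate edge length of a HEALPix cell at depth 9, the resolution listed for the raster benchmark.

```python
import math

# WGS84 flattening and the derived first eccentricity squared.
WGS84_F = 1 / 298.257223563
E2 = WGS84_F * (2 - WGS84_F)
MEAN_RADIUS_KM = 6371.0  # mean Earth radius, used only for rough distances


def geocentric_latitude_deg(geodetic_deg: float) -> float:
    """Geocentric latitude of a point on the ellipsoid surface."""
    phi = math.radians(geodetic_deg)
    return math.degrees(math.atan((1 - E2) * math.tan(phi)))


def latitude_offset_km(geodetic_deg: float) -> float:
    """Ground distance equivalent of (geodetic - geocentric) latitude."""
    delta_deg = geodetic_deg - geocentric_latitude_deg(geodetic_deg)
    return math.radians(delta_deg) * MEAN_RADIUS_KM


def healpix_cell_edge_km(depth: int) -> float:
    """Square root of the cell area; HEALPix has 12 * 4**depth equal-area cells."""
    n_cells = 12 * 4 ** depth
    area_km2 = 4 * math.pi * MEAN_RADIUS_KM ** 2 / n_cells
    return math.sqrt(area_km2)


if __name__ == "__main__":
    print(f"cell edge at depth 9: ~{healpix_cell_edge_km(9):.1f} km")
    for lat in (0.0, 48.0, 62.0, 78.0):
        print(f"{lat:>4.0f} deg: offset ~{latitude_offset_km(lat):.1f} km")
```

Under this proxy the offset is zero at the equator, peaks near 45°, and at 48° exceeds the depth-9 cell edge, which fits the pattern of most mid-latitude pixels landing in a different cell.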
Software Environment
Python 3.11
H3 v4.x (Uber's hexagonal hierarchical spatial index)
xdggs (vectorized DGGS operations)
healpix-geo v0.0.11 (HEALPix with WGS84 ellipsoid support) (new in v3.0.0)
GeoPandas, Rasterio, NumPy, Pandas, Polars
SciPy (Voronoi tessellation, Gaussian filtering)
How to Reproduce
The complete replication environment is available at: https://github.com/annefou/dggs_replication_2026
Using Docker (Recommended)
docker pull ghcr.io/annefou/dggs_replication_2026:latest
docker run -v "$(pwd)/results:/app/results" ghcr.io/annefou/dggs_replication_2026:latest
Using Python
git clone https://github.com/annefou/dggs_replication_2026.git
cd dggs_replication_2026
pip install -r requirements.txt
# H3 replication (v2.0.0)
python run_replication.py --all --output results_h3
# HEALPix/healpix-geo replication (v3.0.0, new)
python run_healpix_geo_replication.py --all --output results_healpix_geo
# Cross-DGGS comparison (v3.0.0, new)
python run_comparison.py \
--h3 results_h3 \
--healpix-geo results_healpix_geo \
--output results_comparison
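Once the scripts have run, the result CSVs can be inspected without assuming anything about their schema. The helper below is a hypothetical convenience, not part of the repository; it only reads the header and the first few rows. The path in the usage comment assumes the CSV is written directly into the --output directory, which the source does not confirm.

```python
import csv
from pathlib import Path


def preview_csv(path, max_rows=5):
    """Return (header, first rows) of a CSV file as lists of strings."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        # zip stops after max_rows without reading the rest of the file.
        rows = [row for _, row in zip(range(max_rows), reader)]
    return header, rows


# Example usage (path is an assumption about the output layout):
# header, rows = preview_csv("results_h3/vector_benchmark.csv")
# print(header)
# for row in rows:
#     print(row)
```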
Citation
If you use this dataset, please cite both the original paper and this replication:
Original Paper
@article{law2024dggs,
title={Using a discrete global grid system for a scalable, interoperable,
and reproducible system of land-use mapping},
author={Law, Richard M. and Ardo, James},
journal={Big Earth Data},
volume={9},
number={1},
pages={29--46},
year={2024},
publisher={Taylor \& Francis},
doi={10.1080/20964471.2024.2429847}
}
This Replication Dataset (v3.0.0)
@dataset{fouilloux2026dggs_replication,
author = {Fouilloux, Anne},
title = {{DGGS Benchmark Replication Study: Results Dataset}},
year = {2026},
publisher = {Zenodo},
version = {3.0.0},
doi = {10.5281/zenodo.18343025},
url = {https://doi.org/10.5281/zenodo.18343025}
}
Original Benchmark Code
The original benchmark code from the paper is available at:
Repository: https://github.com/manaakiwhenua/dggsBenchmarks
Version used: v1.1.1
License
This dataset is released under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
Author
Anne Fouilloux
ORCID: 0000-0002-1784-2920
Affiliation: LifeWatch ERIC
Acknowledgments
Richard M. Law and James Ardo (Manaaki Whenua – Landcare Research) for the original research
Uber Technologies for the H3 library
The xdggs development team for vectorized DGGS operations
The healpix-geo development team for WGS84-aware HEALPix indexing (new in v3.0.0)
Related Resources
DGGS and H3
H3 Documentation: https://h3geo.org/
H3 Python API: https://uber.github.io/h3-py/
H3 Resolution Table: https://h3geo.org/docs/core-library/restable/
OGC DGGS Standard: https://www.ogc.org/standard/dggs/
HEALPix and healpix-geo
healpix-geo Documentation: https://healpix-geo.readthedocs.io/ (new in v3.0.0)
healpix-geo PyPI: https://pypi.org/project/healpix-geo/ (new in v3.0.0)
cdshealpix (Rust backend): https://github.com/cds-astro/cds-healpix-python (new in v3.0.0)
xdggs and Related Tools
xdggs Documentation: https://xdggs.readthedocs.io/
xdggs GitHub: https://github.com/xarray-contrib/xdggs
h3ronpy (used by xdggs): https://github.com/nmandery/h3ronpy
Original Research
Original paper: https://doi.org/10.1080/20964471.2024.2429847
Original benchmark code: https://github.com/manaakiwhenua/dggsBenchmarks
vector2dggs tool: https://github.com/manaakiwhenua/vector2dggs
raster2dggs tool: https://github.com/manaakiwhenua/raster2dggs
Reproducibility Resources
FORRT Replication Handbook: https://forrt.org/replication_handbook/
The Turing Way - Reproducibility: https://the-turing-way.netlify.app/reproducible-research/
Dataset generated: March 2026
Replication framework version: 3.0.0
Provider: Zenodo
Created: 2026-03-07



