
cassini-team-todo/eu-hydro-master-skeleton

Hugging Face · Updated 2026-04-24 · Indexed 2026-05-03
Download link:
https://hf-mirror.com/datasets/cassini-team-todo/eu-hydro-master-skeleton
Description:
---
license: cc-by-4.0
pretty_name: EU-Hydro Master Skeleton (curated GeoParquet)
tags:
- geospatial
- hydrology
- europe
- geoparquet
- copernicus
---

# EU-Hydro Master Skeleton

Per-basin GeoParquet shards derived from the Copernicus **EU-Hydro v1.3** GeoPackages. Four layers are published (river centerlines, river-surface polygons, inland-water polygons covering lakes and wide waters, and river-basin polygons), all reprojected to a common CRS and stripped of admin-only columns for easier querying.

## Contents

```
eu_hydro_master_skeleton_geoparquet/
├── river_lines/      # River_Net_l   MultiLineString   ~1.3 M features
├── river_polygons/   # River_Net_p   MultiPolygon      ~12 k features
├── inland_water/     # InlandWater   MultiPolygon      ~380 k features
├── river_basins/     # RiverBasins   MultiPolygon      ~100 features
└── manifest.csv      # layer, file, source_basin, features
merge_euhydro.py      # reproducer (extract zip → curate → GeoParquet)
```

Each subdirectory holds one shard per basin: `euhydro_<basin>_v013.geoparquet`. Shards are zstd-compressed and carry a `source_basin` column so you can concat them without losing provenance.

### Layer quick reference

| Directory | Source layer | Geometry | Use case |
|---|---|---|---|
| `river_lines/` | `River_Net_l` | MultiLineString | Network topology, Strahler order, routing |
| `river_polygons/` | `River_Net_p` | MultiPolygon | Actual water-surface area of wide rivers; useful for satellite-image overlap (Sentinel-2 NDWI etc.) |
| `inland_water/` | `InlandWater` | MultiPolygon | Lakes, reservoirs, and any water body modelled as an area |
| `river_basins/` | `RiverBasins` | MultiPolygon | Catchment polygons for aggregating per-basin stats |

### Common properties

- **CRS**: `EPSG:3035` (ETRS89 / LAEA Europe); units are metres, safe for area/length maths without reprojection.
- **Dimensions**: 2D. Original Z/M values from EU-Hydro were dropped.
- **Columns**: admin-only fields (`BEGLIFEVER`, `ENDLIFEVER`, `UPDAT_BY`, `UPDAT_WHEN`) were removed. All other source columns are preserved.
- **Excluded basins**: `fr_guiana`, `fr_islands`, `iceland` (non-continental or overseas).

One note: the `hondo` basin has no `river_polygons/` shard. That's expected: the source GPKG's `River_Net_p` layer is empty for this basin (no rivers wide enough to be captured as an area).

## Quick start

```python
import geopandas as gpd
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    "cassini-team-todo/eu-hydro-master-skeleton",
    "eu_hydro_master_skeleton_geoparquet/river_polygons/euhydro_shannon_v013.geoparquet",
    repo_type="dataset",
)
shannon_polys = gpd.read_parquet(path)
print(shannon_polys.crs, shannon_polys.geom_type.unique(), len(shannon_polys))
```

Loading a whole layer across all basins:

```python
from pathlib import Path

import geopandas as gpd
import pandas as pd
from huggingface_hub import snapshot_download

local_root = Path(snapshot_download(
    "cassini-team-todo/eu-hydro-master-skeleton",
    repo_type="dataset",
    allow_patterns="eu_hydro_master_skeleton_geoparquet/inland_water/*.geoparquet",
))
shards = sorted((local_root / "eu_hydro_master_skeleton_geoparquet" / "inland_water").glob("*.geoparquet"))
inland_eu = pd.concat([gpd.read_parquet(p) for p in shards], ignore_index=True)
```

## Reproducing

The source GeoPackages aren't redistributable through this repo; download them from the Copernicus Land portal (EU-Hydro v1.3, per-basin GPKG zips). Place all `euhydro_*_v013_GPKG.zip` files in the same directory as `merge_euhydro.py` and run:

```bash
python merge_euhydro.py
```

This will extract each zip to a temp dir, read the four target layers with `force_2d=True`, drop admin columns, reproject any basin whose CRS differs from the first valid one, and write the output tree next to the script. A `manifest.csv` summarizing every shard is written at the end.

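One way to sanity-check a reproduced tree is to total the manifest per layer and compare the results with the approximate feature counts listed under Contents. The following is a minimal sketch, not part of `merge_euhydro.py`: it assumes `manifest.csv` has a header row with exactly the column names `layer`, `file`, `source_basin`, and `features`, and the helper name `summarize_manifest` and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def summarize_manifest(handle):
    """Return {layer: (shard_count, total_features)} for an open manifest file.

    Assumes a header row with 'layer' and 'features' columns (an assumption
    based on the column list in the Contents tree).
    """
    shard_counts = defaultdict(int)
    feature_totals = defaultdict(int)
    for row in csv.DictReader(handle):
        layer = row["layer"]
        shard_counts[layer] += 1
        feature_totals[layer] += int(row["features"])
    return {layer: (shard_counts[layer], feature_totals[layer]) for layer in shard_counts}


# Made-up rows for illustration only; real values come from your own manifest.csv.
sample = io.StringIO(
    "layer,file,source_basin,features\n"
    "river_lines,euhydro_shannon_v013.geoparquet,shannon,120\n"
    "river_lines,euhydro_hondo_v013.geoparquet,hondo,80\n"
    "inland_water,euhydro_shannon_v013.geoparquet,shannon,45\n"
)
summary = summarize_manifest(sample)
for layer, (shards, features) in sorted(summary.items()):
    print(f"{layer}: {shards} shards, {features} features")
```

To run it against a real output tree, open `eu_hydro_master_skeleton_geoparquet/manifest.csv` with `open(path, newline="")` and pass the handle to the same function. A layer with fewer shards than the others is not necessarily an error, since `hondo` legitimately has no `river_polygons/` shard.
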

## Source and licensing

- **Source**: Copernicus EU-Hydro v1.3: https://land.copernicus.eu/en/products/eu-hydro
- **License**: Copernicus data is free and open (attribution required). This derived dataset is released under CC-BY-4.0; please cite Copernicus EU-Hydro in any downstream use.

Provider:
cassini-team-todo