ArchEGraph/ArchEGraph-demo
Source: Hugging Face. Updated 2026-04-28; indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/ArchEGraph/ArchEGraph-demo
Resource description:
---
pretty_name: ArchEGraph-demo
license: apache-2.0
task_categories:
- time-series-forecasting
- tabular-regression
- graph-ml
language:
- en
size_categories:
- n<1K
tags:
- building-energy
- weather
- geometry
- npz
- time-series
- graph
---
# ArchEGraph-demo
## Dataset Summary
ArchEGraph-demo is a multimodal building simulation dataset for building-level energy modeling.
The dataset links four modalities:
- Building-level descriptors in `building/` (NPZ)
- Weather features in `weather/` (NPZ)
- Geometry/topology features in `geometry/` (NPZ)
- Hourly energy signals in `energy/` (NPZ)
A manifest file (`manifest.csv`) maps each sample to its corresponding weather/building source and energy file.
## Dataset Structure
Top-level files and folders:
- `manifest.csv`: sample index and metadata table
- `building/`: one NPZ per building ID
- `weather/`: one NPZ per weather/city ID
- `geometry/`: one NPZ per building ID
- `energy/`: one NPZ per (building, city) sample
Observed statistics from `manifest.csv` and folder counts:
- Number of samples: 300
- Number of unique buildings: 75
- Number of unique weather IDs (cities): 48
- `n_steps`: always 8760 (hourly one-year series)
- `n_spaces`: min 2, max 132
In the energy files sampled for inspection, each NPZ contains these keys:
- `values`: shape `(8760, n_spaces)`
- `columns`: shape `(n_spaces,)`
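As a minimal sketch of reading one energy file, the helper below loads the two documented keys and checks that their shapes agree. The function name `load_energy` is hypothetical, and the sketch assumes `columns` is stored as a plain string array; if it turns out to be an object array, `np.load` would need `allow_pickle=True`, which should only be used on trusted files.

```python
import numpy as np


def load_energy(path):
    """Load one energy NPZ and return (values, columns).

    Assumes the two keys documented in the card: `values` and `columns`.
    """
    with np.load(path) as npz:
        values = npz["values"]    # expected shape: (8760, n_spaces)
        columns = npz["columns"]  # expected shape: (n_spaces,)
    # Guard against files that do not follow the documented layout.
    if values.ndim != 2 or columns.ndim != 1 or values.shape[1] != columns.shape[0]:
        raise ValueError(f"unexpected shapes: {values.shape}, {columns.shape}")
    return values, columns
```

Each column of `values` is then the hourly series for the space named at the same index in `columns`.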
## Data Fields
Columns in `manifest.csv`:
- `sample_id`: unique sample identifier, pattern `buildingID__City`
- `source_job_tag`: source job tag, usually the same as `sample_id`
- `weather_id`: city/weather identifier
- `building_id`: building identifier (integer-like string)
- `energy_file`: filename under `energy/`
- `n_steps`: number of timesteps in energy sequence (8760)
- `n_spaces`: number of simulated spaces/zones
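The columns above are enough to index the dataset with the standard library. The sketch below reads `manifest.csv`, converts the count columns to integers, and resolves file paths. The helper names are hypothetical, and the `<id>.npz` naming for `building/`, `geometry/`, and `weather/` is an assumption that the card does not confirm; only `energy_file` is documented as a filename.

```python
import csv
from pathlib import Path


def read_manifest(root):
    """Read manifest.csv under `root` and return one dict per sample."""
    root = Path(root)
    rows = []
    with open(root / "manifest.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # CSV values arrive as strings; convert the two count columns.
            row["n_steps"] = int(row["n_steps"])
            row["n_spaces"] = int(row["n_spaces"])
            # `energy_file` is documented as a filename under energy/.
            row["energy_path"] = root / "energy" / row["energy_file"]
            # ASSUMPTION: per-ID files are named "<id>.npz"; adjust if the repo differs.
            row["building_path"] = root / "building" / f"{row['building_id']}.npz"
            row["geometry_path"] = root / "geometry" / f"{row['building_id']}.npz"
            row["weather_path"] = root / "weather" / f"{row['weather_id']}.npz"
            rows.append(row)
    return rows


def summarize(rows):
    """Recompute headline statistics from the manifest rows."""
    spaces = [r["n_spaces"] for r in rows]
    return {
        "n_samples": len(rows),
        "n_buildings": len({r["building_id"] for r in rows}),
        "n_weather_ids": len({r["weather_id"] for r in rows}),
        "n_spaces_range": (min(spaces), max(spaces)),
    }
```

Running `summarize(read_manifest(root))` on a local copy is a quick way to check the download against the statistics reported in this card.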
## Intended Uses
- Building energy forecasting and simulation surrogate modeling
- Multimodal learning across weather, building, and geometry data
- Domain adaptation and transfer learning across cities and buildings
- Representation learning for urban building stock analytics
## Out-of-Scope Uses
- Safety-critical operational decisions without validation
- Regulatory compliance without domain expert review
## Data Splits
No official train/validation/test split is provided in this demo release.
Suggested split strategies:
- Building-level holdout (unseen buildings)
- Weather/city-level holdout (unseen climates)
- Temporal holdout within each series
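One way to realize these strategies is sketched below; it is an illustration, not an official split. `group_holdout` and `temporal_split` are hypothetical helpers, and `rows` is assumed to be a list of dicts keyed by the manifest column names. Holding out whole groups keeps every sample of a held-out building (or city) out of the training set.

```python
import random


def group_holdout(rows, key="building_id", test_fraction=0.2, seed=0):
    """Split rows so that no value of `key` appears in both train and test.

    Use key="building_id" for unseen buildings, key="weather_id" for unseen climates.
    """
    groups = sorted({r[key] for r in rows})
    rng = random.Random(seed)  # seeded for a reproducible split
    rng.shuffle(groups)
    n_test = max(1, round(len(groups) * test_fraction))
    held_out = set(groups[:n_test])
    train = [r for r in rows if r[key] not in held_out]
    test = [r for r in rows if r[key] in held_out]
    return train, test


def temporal_split(values, train_fraction=0.8):
    """Split one series along its first (time) axis into earlier and later parts."""
    cut = int(len(values) * train_fraction)
    return values[:cut], values[cut:]
```

Note that a building-level holdout still shares cities between train and test, and vice versa; a stricter evaluation would apply both filters and drop the mixed samples.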
## Data Preprocessing
Recommended baseline preprocessing:
- Normalize weather and energy channels using training split only
- Align modalities by `building_id`, `weather_id`, and `sample_id`
- Handle variable `n_spaces` with padding, masking, or set/graph models
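The normalization and padding steps can be sketched as follows. This is a baseline under stated assumptions: `fit_scaler` pools all training values into a single mean and standard deviation (per-channel statistics are an equally valid choice), and `pad_and_mask` zero-pads the space axis. Both function names are hypothetical.

```python
import numpy as np


def fit_scaler(train_arrays):
    """Compute one global mean/std over training arrays only (all spaces pooled)."""
    flat = np.concatenate([a.ravel() for a in train_arrays])
    mean = flat.mean()
    std = flat.std()
    # Avoid division by zero for constant data.
    return mean, (std if std > 0 else 1.0)


def pad_and_mask(arrays, max_spaces=None):
    """Stack (n_steps, n_spaces) arrays into (batch, n_steps, width) plus a mask.

    mask[i, j] is True where space j exists for sample i, False where padded.
    """
    n_steps = arrays[0].shape[0]
    width = max_spaces or max(a.shape[1] for a in arrays)
    batch = np.zeros((len(arrays), n_steps, width), dtype=np.float32)
    mask = np.zeros((len(arrays), width), dtype=bool)
    for i, a in enumerate(arrays):
        k = a.shape[1]
        batch[i, :, :k] = a
        mask[i, :k] = True
    return batch, mask
```

Apply `(x - mean) / std` to validation and test arrays with the statistics fitted on the training split, and use the mask to exclude padded spaces from losses and metrics.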
## Licensing
The card metadata lists the license as `apache-2.0`. The dataset owner should confirm this value and update it if needed.
## Citation
If you use this dataset, please cite the project/paper/repository that released ArchEGraph-demo.
## Croissant Metadata
A machine-readable Croissant metadata file is provided in `croissant.json`.
Provider:
ArchEGraph



