
ArchEGraph/ArchEGraph-demo

Hugging Face | Updated 2026-04-28 | Indexed 2026-05-03
Download link:
https://hf-mirror.com/datasets/ArchEGraph/ArchEGraph-demo
Resource description:
---
pretty_name: ArchEGraph-demo
license: apache-2.0
task_categories:
- time-series-forecasting
- tabular-regression
- graph-ml
language:
- en
size_categories:
- n<1K
tags:
- building-energy
- weather
- geometry
- npz
- time-series
- graph
---

# ArchEGraph-demo

## Dataset Summary

ArchEGraph-demo is a multimodal building simulation dataset for building-level energy modeling.

The dataset links four modalities:

- Building-level descriptors in `building/` (NPZ)
- Weather features in `weather/` (NPZ)
- Geometry/topology features in `geometry/` (NPZ)
- Hourly energy signals in `energy/` (NPZ)

A manifest file (`manifest.csv`) maps each sample to its corresponding weather/building source and energy file.

## Dataset Structure

Top-level files and folders:

- `manifest.csv`: sample index and metadata table
- `building/`: one NPZ per building ID
- `weather/`: one NPZ per weather/city ID
- `geometry/`: one NPZ per building ID
- `energy/`: one NPZ per (building, city) sample

Observed statistics from `manifest.csv` and folder counts:

- Number of samples: 300
- Number of unique buildings: 75
- Number of unique weather IDs (cities): 48
- `n_steps`: always 8760 (hourly one-year series)
- `n_spaces`: min 2, max 132

For sampled energy files, the NPZ keys are:

- `values`: shape `(8760, n_spaces)`
- `columns`: shape `(n_spaces,)`

## Data Fields

Columns in `manifest.csv`:

- `sample_id`: unique sample identifier, pattern `buildingID__City`
- `source_job_tag`: source job tag, usually the same as `sample_id`
- `weather_id`: city/weather identifier
- `building_id`: building identifier (integer-like string)
- `energy_file`: filename under `energy/`
- `n_steps`: number of timesteps in the energy sequence (8760)
- `n_spaces`: number of simulated spaces/zones

## Intended Uses

- Building energy forecasting and simulation surrogate modeling
- Multimodal learning across weather, building, and geometry data
- Domain adaptation and transfer learning across cities and buildings
- Representation learning for urban building stock analytics

## Out-of-Scope Uses

- Safety-critical operational decisions without validation
- Regulatory compliance without domain expert review

## Data Splits

No official train/validation/test split is provided in this demo release.

Suggested split strategies:

- Building-level holdout (unseen buildings)
- Weather/city-level holdout (unseen climates)
- Temporal holdout within each series

## Data Preprocessing

Recommended baseline preprocessing:

- Normalize weather and energy channels using statistics from the training split only
- Align modalities by `building_id`, `weather_id`, and `sample_id`
- Handle variable `n_spaces` with padding, masking, or set/graph models

## Licensing

The card metadata declares `license: apache-2.0`, while the license here is listed as `unknown`; the dataset owner should reconcile the two and update the card if needed.

## Citation

If you use this dataset, please cite the project/paper/repository that released ArchEGraph-demo.

## Croissant Metadata

A machine-readable Croissant metadata file is provided in `croissant.json`.
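
The building-level holdout and the baseline preprocessing suggested in the card (training-only normalization, padding and masking for variable `n_spaces`) could be sketched roughly as follows. This is a minimal illustration, not the dataset authors' pipeline: the helper names (`load_energy`, `building_holdout_split`, `pad_and_mask`, `fit_normalizer`) are hypothetical, the file naming `<sample_id>.npz` is an assumption, and the demo runs on small synthetic files (24 steps rather than 8760) instead of the real data.

```python
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd


def load_energy(root, energy_file):
    """Read the `values` array, shape (n_steps, n_spaces), from one energy NPZ."""
    with np.load(Path(root) / "energy" / energy_file) as npz:
        return npz["values"].astype(np.float32)


def building_holdout_split(manifest, test_fraction=0.2, seed=0):
    """Split manifest rows so that held-out buildings never appear in training."""
    buildings = sorted(manifest["building_id"].unique().tolist())
    order = np.random.default_rng(seed).permutation(len(buildings))
    n_test = max(1, round(len(buildings) * test_fraction))
    test_ids = [buildings[i] for i in order[:n_test]]
    is_test = manifest["building_id"].isin(test_ids)
    return manifest[~is_test], manifest[is_test]


def pad_and_mask(series_list):
    """Zero-pad arrays along the space axis; mask marks real (unpadded) spaces."""
    n_steps = series_list[0].shape[0]
    max_spaces = max(a.shape[1] for a in series_list)
    batch = np.zeros((len(series_list), n_steps, max_spaces), dtype=np.float32)
    mask = np.zeros((len(series_list), max_spaces), dtype=bool)
    for i, a in enumerate(series_list):
        batch[i, :, : a.shape[1]] = a
        mask[i, : a.shape[1]] = True
    return batch, mask


def fit_normalizer(batch, mask):
    """Mean and std over the unpadded entries of the training batch only."""
    valid = np.broadcast_to(mask[:, None, :], batch.shape)
    values = batch[valid]
    return float(values.mean()), float(values.std()) + 1e-8


# Synthetic stand-in for the real files: 24 steps instead of 8760, made-up IDs.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / "energy").mkdir()
    rows = []
    rng = np.random.default_rng(0)
    for building_id, city, n_spaces in [("1", "CityA", 2), ("1", "CityB", 2),
                                        ("2", "CityA", 5), ("3", "CityB", 3)]:
        sample_id = f"{building_id}__{city}"
        # Assumed file name; the real name comes from the `energy_file` column.
        np.savez(Path(root) / "energy" / f"{sample_id}.npz",
                 values=rng.random((24, n_spaces)),
                 columns=np.array([f"space_{j}" for j in range(n_spaces)]))
        rows.append({"sample_id": sample_id, "weather_id": city,
                     "building_id": building_id,
                     "energy_file": f"{sample_id}.npz", "n_spaces": n_spaces})
    manifest = pd.DataFrame(rows)

    train, test = building_holdout_split(manifest, test_fraction=0.34, seed=0)
    train_batch, train_mask = pad_and_mask(
        [load_energy(root, f) for f in train["energy_file"]])
    mean, std = fit_normalizer(train_batch, train_mask)
    # Apply training statistics; keep padded positions at exactly zero.
    train_norm = np.where(train_mask[:, None, :], (train_batch - mean) / std, 0.0)
```

With the real release, `root` would point at the downloaded dataset folder and `manifest` would come from `pd.read_csv("manifest.csv", dtype={"building_id": str})`, so that integer-like building IDs stay strings. A weather/city-level holdout follows the same pattern with `weather_id` in place of `building_id`.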
Provider: ArchEGraph