cat1233211/liander2024-energy-forecasting-benchmark

Hugging Face | Updated 2026-04-17 | Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/cat1233211/liander2024-energy-forecasting-benchmark
Resource description:
---
license: cc-by-4.0
pretty_name: Liander 2024 Short Term Energy Forecasting Benchmark
size_categories:
- 1M<n<10M
source_datasets:
- original
viewer: false
tags:
- climate
- energy
- forecasting
- time-series
- weather
- power-grid
- profiles
- price
- electricity
- load
- demand
- generation
- short-term
task_categories:
- time-series-forecasting
task_ids:
- univariate-time-series-forecasting
- multivariate-time-series-forecasting
---

# Dataset Card for Liander 2024 Short Term Energy Forecasting Benchmark

[![Hugging Face Dataset](https://img.shields.io/badge/🤗%20Hugging%20Face-Dataset-blue)](https://huggingface.co/datasets/OpenSTEF/liander2024-energy-forecasting-benchmark)

This dataset provides a benchmark for short term energy forecasting models, combining electrical load measurements from Dutch DSO Liander with predictors like corresponding weather data from OpenMeteo, day-ahead electricity prices from ENTSO-E, and profiles of electricity consumption from Energiedatawijzer. The dataset covers the full year 2024 (2024-01-01 to 2025-01-01 UTC) and includes 55 different points in the grid across the Netherlands at various levels within the grid.

The dataset is designed for developing and validating short-term energy forecasting models, particularly those that incorporate weather variables. It serves as a standardized benchmark for comparing different short term forecasting approaches in the energy domain.

## Dataset Details

- **Curated by:** [OpenSTEF](https://github.com/OpenSTEF)
- **License:** Creative Commons BY 4.0 (CC BY 4.0). See below for specific source data licenses.
- **Data Period:** 2024-01-01 to 2025-01-01
- **Temporal Resolution:** 15-minute intervals for load measurements and profiles, hourly for weather data and prices (interpolated to 15-minute intervals)
- **Geographic Coverage:** 55 points in the grid across the Netherlands in Liander service area
- **Total Size:** ~3-6M data points across all components

## Dataset Components

The dataset consists of six main components:

### 1. Load Measurements (`load_measurements/`)

Electrical load (active power) measurements from various types of infrastructure managed by Dutch DSO Liander. All measurements are recorded at 15-minute intervals.

**Location Types:**

- **mv_feeder** (Medium Voltage Feeders): Outgoing medium voltage cables from primary substations
- **station_installation** (Station Installations): Various primary substation installations
- **transformer** (Transformers): Power transformers at primary substations
- **solar_park** (Solar Parks): Anonymized and normalized individual solar park measurements
- **wind_park** (Wind Parks): Anonymized and normalized individual wind park measurements

> [!NOTE]
> Solar and wind park data includes a 2-day availability delay to simulate data availability constraints at Dutch DSOs.

| Column | Type | Unit | Description |
|:------:|:----:|:----:|:-----------:|
| timestamp | datetime64[ns, UTC] | - | Measurement timestamp in UTC |
| load | float64 | W | Electrical load in watts |
| available_at | datetime64[ns, UTC] | - | Data availability timestamp |

### 2. Weather Measurements (`weather_measurements/`)

Historical weather measurements from OpenMeteo for each load measurement location, providing ground truth weather conditions.

| Column | Type | Unit | Description |
|:------:|:----:|:----:|:-----------:|
| temperature_2m | float32 | °C | Air temperature at 2 meters above ground |
| relative_humidity_2m | float32 | % | Relative humidity at 2 meters above ground |
| surface_pressure | float32 | hPa | Atmospheric pressure at surface level |
| cloud_cover | float32 | % | Total cloud cover as area fraction |
| wind_speed_10m | float32 | km/h | Wind speed at 10 meters above ground |
| wind_direction_10m | float32 | ° | Wind direction at 10 meters above ground |
| shortwave_radiation | float32 | W/m² | Shortwave solar radiation |
| direct_radiation | float32 | W/m² | Direct solar radiation on horizontal plane |
| diffuse_radiation | float32 | W/m² | Diffuse solar radiation |
| direct_normal_irradiance | float32 | W/m² | Direct solar radiation on normal plane |

### 3. Weather Forecasts (`weather_forecasts/`)

Latest available weather forecasts from OpenMeteo (short horizon). These represent the best available forecast at each time point.

> [!WARNING]
> This component is useful for simple forecasting experiments but is not fully realistic for benchmarking since it does not simulate real-world forecast availability.

| Column | Type | Unit | Description |
|:------:|:----:|:----:|:-----------:|
| temperature_2m | float32 | °C | Air temperature at 2 meters above ground |
| relative_humidity_2m | float32 | % | Relative humidity at 2 meters above ground |
| surface_pressure | float32 | hPa | Atmospheric pressure at surface level |
| cloud_cover | float32 | % | Total cloud cover as area fraction |
| wind_speed_10m | float32 | km/h | Wind speed at 10 meters above ground |
| wind_speed_80m | float32 | km/h | Wind speed at 80 meters above ground |
| wind_direction_10m | float32 | ° | Wind direction at 10 meters above ground |
| shortwave_radiation | float32 | W/m² | Shortwave solar radiation |
| direct_radiation | float32 | W/m² | Direct solar radiation on horizontal plane |
| diffuse_radiation | float32 | W/m² | Diffuse solar radiation |
| direct_normal_irradiance | float32 | W/m² | Direct solar radiation on normal plane |

### 4. Versioned Weather Forecasts (`weather_forecasts_versioned/`)

Time-versioned weather forecasts with lead times up to 7 days ahead, simulating real-world data availability. This component provides the most realistic forecasting scenario.

> [!NOTE]
> This enables realistic evaluation where forecasts are only available at specific times with specific lead times, matching real-world operational constraints.

| Column | Type | Unit | Description |
|:------:|:----:|:----:|:-----------:|
| timestamp | datetime64[ns, UTC] | - | Target forecast timestamp |
| available_at | datetime64[ns, UTC] | - | When the forecast was available/created |
| temperature_2m | float32 | °C | Air temperature at 2 meters above ground |
| relative_humidity_2m | float32 | % | Relative humidity at 2 meters above ground |
| surface_pressure | float32 | hPa | Atmospheric pressure at surface level |
| cloud_cover | float32 | % | Total cloud cover as area fraction |
| wind_speed_10m | float32 | km/h | Wind speed at 10 meters above ground |
| wind_speed_80m | float32 | km/h | Wind speed at 80 meters above ground |
| wind_direction_10m | float32 | ° | Wind direction at 10 meters above ground |
| shortwave_radiation | float32 | W/m² | Shortwave solar radiation |
| direct_radiation | float32 | W/m² | Direct solar radiation on horizontal plane |
| diffuse_radiation | float32 | W/m² | Diffuse solar radiation |
| direct_normal_irradiance | float32 | W/m² | Direct solar radiation on normal plane |

### 5. EPEX Day-Ahead Prices (`EPEX.parquet`)

Day-ahead electricity prices for the Netherlands from ENTSO-E Transparency Platform, providing market price signals that influence energy consumption patterns.

| Column | Type | Unit | Description |
|:------:|:----:|:----:|:-----------:|
| timestamp | datetime64[ns, UTC] | - | Price delivery timestamp in UTC |
| available_at | datetime64[ns, UTC] | - | When the price was published/available |
| price | float64 | €/MWh | Day-ahead electricity price in euros per megawatt hour |

### 6. Electricity Consumption Profiles (`profiles.parquet`)

Standardized electricity consumption profiles from Energiedatawijzer for various customer categories in the Netherlands, representing typical usage patterns throughout the year. These values are typically normalized to sum to 1 over the year.

There are 15 types of profiles, which can be read as follows: `{category}_{type}_{direction}`, where `category` says something about the connection type, `type` indicates whether it is a connection with or without infeed, and `direction` indicates whether it is a consumption or generation profile (we only include consumption profiles as infeed says something about previous year's generation). For a full description of the profiles, see the [Energiedatawijzer documentation](https://energiedatawijzer.nl/documenten/profielen-elektriciteit-2024/).

| Column | Type | Unit | Description |
|:------:|:----:|:----:|:-----------:|
| timestamp | datetime64[ns, UTC] | - | Profile timestamp in UTC |
| available_at | datetime64[ns, UTC] | - | Data availability timestamp |
| {profiles} | float64 | - | 15 profiles for different categories |

## Uses

This dataset is intended for energy forecasting research, providing a standardized benchmark for comparing different forecasting approaches in the energy domain. The dataset supports various forecasting horizons and scenarios:

- **Operational Forecasting**: 15-minute to 24-hour ahead load predictions
- **Day-ahead Congestion Management**: Using weather forecasts for next-day congestion predictions
- **Multi-modal Forecasting**: Combining multiple infrastructure types and weather variables
- **Uncertainty Quantification**: Using versioned forecasts to assess prediction uncertainty
- **Weather-Energy Relationship Studies**: Analyzing correlations between weather variables and electrical load

This dataset is compatible with various forecasting frameworks, including **[OpenSTEF](https://github.com/OpenSTEF/openstef)** (Open Short Term Energy Forecasting), classical time series models, machine learning approaches, and deep learning models.
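The day-ahead and versioned-forecast scenarios listed above rely on using only forecasts that had already been published when a prediction is made. The following is a minimal sketch of that point-in-time selection, assuming the versioned forecast table exposes `timestamp` and `available_at` as regular columns, as the component tables describe. The helper name `latest_available` and the toy values are hypothetical and are not part of the dataset or of OpenSTEF.

```python
import pandas as pd


def latest_available(forecasts: pd.DataFrame, issue_time: pd.Timestamp) -> pd.DataFrame:
    """Hypothetical helper: newest forecast per target time known at issue_time."""
    # Discard forecast versions that were not yet published at issue_time.
    known = forecasts[forecasts["available_at"] <= issue_time]
    # Only target times after the issue time are of interest.
    known = known[known["timestamp"] > issue_time]
    # For each target timestamp keep the most recently published version.
    known = known.sort_values(["timestamp", "available_at"])
    return known.drop_duplicates(subset="timestamp", keep="last").reset_index(drop=True)


# Toy data standing in for one file from weather_forecasts_versioned/.
toy = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2024-06-02 12:00", "2024-06-02 12:00", "2024-06-02 12:00", "2024-06-02 13:00"],
            utc=True,
        ),
        "available_at": pd.to_datetime(
            ["2024-06-01 00:00", "2024-06-01 06:00", "2024-06-01 12:00", "2024-06-01 06:00"],
            utc=True,
        ),
        "temperature_2m": [18.0, 18.5, 19.0, 19.5],
    }
)

issue_time = pd.Timestamp("2024-06-01 09:00", tz="UTC")
selected = latest_available(toy, issue_time)
print(selected)
```

The same filter on `available_at` can be applied to load measurements, where solar and wind park data carries a 2-day availability delay.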
## Dataset Structure

The dataset is organized in the following directory structure:

```
liander2024/
├── liander2024_targets.yaml          # Location metadata with coordinates
├── load_measurements/                # Electrical load data
│   ├── mv_feeder/                    # Medium voltage feeder measurements
│   ├── station_installation/         # Substation installation measurements
│   ├── transformer/                  # Transformer measurements
│   ├── solar_park/                   # Anonymized solar park measurements
│   └── wind_park/                    # Anonymized wind park measurements
├── weather_measurements/             # Historical weather data
│   └── [same subdirectory structure as above]
├── weather_forecasts/                # Latest weather forecasts
│   └── [same subdirectory structure as above]
├── weather_forecasts_versioned/      # Time-versioned weather forecasts
│   └── [same subdirectory structure as above]
├── EPEX.parquet                      # Day-ahead electricity prices
└── profiles.parquet                  # Electricity consumption profiles
```

Each subdirectory contains individual Parquet files for each location, named according to the location identifier.

### Target Metadata (`liander2024_targets.yaml`)

The `liander2024_targets.yaml` file contains metadata for all 55 forecasting targets in the dataset.
Each target includes:

| Field | Type | Description |
|:-----:|:----:|:-----------:|
| name | string | Unique identifier for the location/asset |
| group_name | string | Infrastructure type: `mv_feeder`, `transformer`, `station_installation`, `solar_park`, or `wind_park` |
| latitude | float | Approximate latitude coordinate* |
| longitude | float | Approximate longitude coordinate* |
| description | string | Human-readable description of the location |
| benchmark_start | datetime | Start of the benchmark evaluation period |
| benchmark_end | datetime | End of the benchmark evaluation period |
| train_start | datetime | Start of the training data period |
| upper_limit | float | 98th percentile of load values (W) |
| lower_limit | float | 2nd percentile of load values (W) |

\* Location coordinates are approximate and only based on the name of the target.

## Dataset Creation

### Source Data

#### Liander Historical Measurements

- **Source**: [Liander Open Data - Historical 15-minute Operational Measurements](https://www.liander.nl/over-ons/open-data#historische-15-minuten-bedrijfsmetingen)
- **License**: [See custom disclaimer](https://www.liander.nl/over-ons/open-data/disclaimer)
- **Description**: 15-minute electrical load measurements from various infrastructure types across Liander's service territory
- **Modifications made**: Converted into standardized Parquet format, removed `_normalized` suffix from load column, added `available_at` timestamps.
#### OpenMeteo Weather Data

- **Source**: [OpenMeteo Historical Weather API](https://open-meteo.com/)
- **License**: CC BY 4.0
- **Description**: Historical weather measurements and forecasts using the best available weather models

#### ENTSO-E Day-Ahead Prices

- **Source**: [ENTSO-E Transparency Platform](https://newtransparency.entsoe.eu/)
- **License**: CC BY 4.0
- **Description**: Day-ahead electricity prices for the Netherlands (EPEX Spot NL)
- **Modifications made**: Converted into Parquet format, converted to UTC, added `available_at` timestamp based on availability of day-ahead prices in NL.

#### Energiedatawijzer Consumption Profiles

- **Source**: [Energiedatawijzer - Profielen elektriciteit 2024](https://energiedatawijzer.nl/documenten/profielen-elektriciteit-2024/)
- **License**: None, but permission granted for use in this dataset
- **Description**: Standardized electricity consumption profiles for various customer categories in the Netherlands
- **Modifications made**: Converted into Parquet format, converted to UTC, added `available_at` timestamp, removed infeed profiles, used first hour of the year to fill the last hour of the year to get a full UTC year.

> [!NOTE]
> Location coordinates are approximate and may not represent exact facility locations. Solar and wind park data is normalized and anonymized for privacy. Weather data is interpolated from hourly to 15-minute resolution to match load measurements.
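The note above mentions that hourly weather data is interpolated to 15-minute resolution. The dataset card does not state which interpolation method was used, so the sketch below assumes plain linear interpolation in time and uses made-up temperature values. It only illustrates the idea and is not the curators' actual preprocessing.

```python
import pandas as pd

# Toy hourly series standing in for one weather variable (values are invented).
hourly = pd.Series(
    [10.0, 14.0, 12.0],
    index=pd.date_range("2024-01-01 00:00", periods=3, freq="60min", tz="UTC"),
    name="temperature_2m",
)

# Upsample to a 15-minute grid and fill the new slots by interpolating in time.
# ASSUMPTION: linear-in-time interpolation; the real dataset may use another method.
quarter_hourly = hourly.resample("15min").interpolate(method="time")

print(quarter_hourly)
```

Because three of every four 15-minute weather values are derived rather than measured, sub-hourly weather variation in these columns should not be read as independent observations.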
## How to Use

You can load the dataset files directly into pandas dataframes:

```python
import pandas as pd

# Base location of the dataset on the Hugging Face Hub
base = "hf://datasets/OpenSTEF/liander2024-energy-forecasting-benchmark"

# Load measurements and versioned weather forecasts for one MV feeder
load_data = pd.read_parquet(f"{base}/load_measurements/mv_feeder/OS Edam.parquet")
weather_data = pd.read_parquet(f"{base}/weather_forecasts_versioned/mv_feeder/OS Edam.parquet")

# Dataset-wide day-ahead prices and consumption profiles
epex = pd.read_parquet(f"{base}/EPEX.parquet")
profiles = pd.read_parquet(f"{base}/profiles.parquet")
```
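Once the individual files are loaded, they usually need to be aligned on their timestamps before modelling. The sketch below shows one way to do that, with small toy frames in place of the downloaded Parquet files so that it runs offline. It assumes `timestamp` is a regular column, as the component tables suggest; if it is stored as the index, call `reset_index()` first. The frame name `features` and all values are illustrative.

```python
import pandas as pd

# Toy frames standing in for load_data, weather_data and epex (values are invented).
ts = pd.date_range("2024-01-01", periods=4, freq="15min", tz="UTC")
load = pd.DataFrame({"timestamp": ts, "load": [1.0e6, 1.1e6, 1.2e6, 1.1e6]})
weather = pd.DataFrame({"timestamp": ts, "temperature_2m": [3.0, 3.1, 3.2, 3.3]})
prices = pd.DataFrame({"timestamp": ts, "price": [50.0, 50.0, 50.0, 50.0]})

# Left-join predictors onto the load series so that every load row is kept,
# then index by time for downstream resampling or lag features.
features = (
    load[["timestamp", "load"]]
    .merge(weather, on="timestamp", how="left")
    .merge(prices[["timestamp", "price"]], on="timestamp", how="left")
    .set_index("timestamp")
    .sort_index()
)

print(features)
```

For the versioned forecasts, which hold several rows per `timestamp`, select one version per target time before merging; otherwise the join duplicates load rows.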
Provided by:
cat1233211