cat1233211/liander2024-energy-forecasting-benchmark
Hugging Face · Updated 2026-04-17 · Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/cat1233211/liander2024-energy-forecasting-benchmark
Resource description:
---
license: cc-by-4.0
pretty_name: Liander 2024 Short Term Energy Forecasting Benchmark
size_categories:
- 1M<n<10M
source_datasets:
- original
viewer: false
tags:
- climate
- energy
- forecasting
- time-series
- weather
- power-grid
- profiles
- price
- electricity
- load
- demand
- generation
- short-term
task_categories:
- time-series-forecasting
task_ids:
- univariate-time-series-forecasting
- multivariate-time-series-forecasting
---
# Dataset Card for Liander 2024 Short Term Energy Forecasting Benchmark
Original dataset: [OpenSTEF/liander2024-energy-forecasting-benchmark](https://huggingface.co/datasets/OpenSTEF/liander2024-energy-forecasting-benchmark)
This dataset provides a benchmark for short-term energy forecasting models. It combines electrical load measurements from the Dutch DSO Liander with predictors: weather data from OpenMeteo for the same locations, day-ahead electricity prices from ENTSO-E, and electricity consumption profiles from Energiedatawijzer. The dataset covers the full year 2024 (2024-01-01 to 2025-01-01 UTC) and includes 55 points across the Netherlands at various levels of the grid.
The dataset is designed for developing and validating short-term energy forecasting models, particularly those that incorporate weather variables. It serves as a standardized benchmark for comparing short-term forecasting approaches in the energy domain.
## Dataset Details
- **Curated by:** [OpenSTEF](https://github.com/OpenSTEF)
- **License:** Creative Commons BY 4.0 (CC BY 4.0). See below for specific source data licenses.
- **Data Period:** 2024-01-01 to 2025-01-01
- **Temporal Resolution:** 15-minute intervals for load measurements and profiles, hourly for weather data and prices (interpolated to 15-minute intervals)
- **Geographic Coverage:** 55 points in the grid across the Netherlands in Liander service area
- **Total Size:** ~3-6M data points across all components
## Dataset Components
The dataset consists of six main components:
### 1. Load Measurements (`load_measurements/`)
Electrical load (active power) measurements from various types of infrastructure managed by Dutch DSO Liander. All measurements are recorded at 15-minute intervals.
**Location Types:**
- **mv_feeder** (Medium Voltage Feeders): Outgoing medium voltage cables from primary substations
- **station_installation** (Station Installations): Various primary substation installations
- **transformer** (Transformers): Power transformers at primary substations
- **solar_park** (Solar Parks): Anonymized and normalized individual solar park measurements
- **wind_park** (Wind Parks): Anonymized and normalized individual wind park measurements
> [!NOTE]
> Solar and wind park data includes a 2-day availability delay to simulate data availability constraints at Dutch DSOs.
| Column | Type | Unit | Description |
|:------:|:----:|:----:|:-----------:|
| timestamp | datetime64[ns, UTC] | - | Measurement timestamp in UTC |
| load | float64 | W | Electrical load in watts |
| available_at | datetime64[ns, UTC] | - | Data availability timestamp |
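
The `available_at` column makes it possible to reconstruct what was known at a given moment. The following is a minimal sketch, not part of the dataset tooling: it uses a small synthetic frame in place of a real location file, a hypothetical helper named `known_at`, and assumes that `timestamp` and `available_at` are regular columns as listed in the table above.

```python
import pandas as pd


def known_at(df: pd.DataFrame, forecast_time: pd.Timestamp) -> pd.DataFrame:
    """Keep only rows that were already available at forecast_time."""
    return df[df["available_at"] <= forecast_time]


# Synthetic stand-in for a solar park file with a 2-day availability delay.
timestamps = pd.date_range("2024-06-01", periods=4, freq="15min", tz="UTC")
load = pd.DataFrame(
    {
        "timestamp": timestamps,
        "load": [0.10, 0.12, 0.15, 0.11],
        "available_at": timestamps + pd.Timedelta(days=2),
    }
)

# At 2024-06-03 00:15 UTC only the first two measurements have been published.
forecast_time = pd.Timestamp("2024-06-03 00:15", tz="UTC")
history = known_at(load, forecast_time)
```

Filtering on `available_at` rather than `timestamp` keeps delayed measurements out of the model inputs, which is what the availability delay is meant to simulate.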
### 2. Weather Measurements (`weather_measurements/`)
Historical weather measurements from OpenMeteo for each load measurement location, providing ground truth weather conditions.
| Column | Type | Unit | Description |
|:------:|:----:|:----:|:-----------:|
| temperature_2m | float32 | °C | Air temperature at 2 meters above ground |
| relative_humidity_2m | float32 | % | Relative humidity at 2 meters above ground |
| surface_pressure | float32 | hPa | Atmospheric pressure at surface level |
| cloud_cover | float32 | % | Total cloud cover as area fraction |
| wind_speed_10m | float32 | km/h | Wind speed at 10 meters above ground |
| wind_direction_10m | float32 | ° | Wind direction at 10 meters above ground |
| shortwave_radiation | float32 | W/m² | Shortwave solar radiation |
| direct_radiation | float32 | W/m² | Direct solar radiation on horizontal plane |
| diffuse_radiation | float32 | W/m² | Diffuse solar radiation |
| direct_normal_irradiance | float32 | W/m² | Direct solar radiation on normal plane |
### 3. Weather Forecasts (`weather_forecasts/`)
Latest available weather forecasts from OpenMeteo (short horizon). These represent the best available forecast at each time point.
> [!WARNING]
> This component is useful for simple forecasting experiments but is not fully realistic for benchmarking since it does not simulate real-world forecast availability.
| Column | Type | Unit | Description |
|:------:|:----:|:----:|:-----------:|
| temperature_2m | float32 | °C | Air temperature at 2 meters above ground |
| relative_humidity_2m | float32 | % | Relative humidity at 2 meters above ground |
| surface_pressure | float32 | hPa | Atmospheric pressure at surface level |
| cloud_cover | float32 | % | Total cloud cover as area fraction |
| wind_speed_10m | float32 | km/h | Wind speed at 10 meters above ground |
| wind_speed_80m | float32 | km/h | Wind speed at 80 meters above ground |
| wind_direction_10m | float32 | ° | Wind direction at 10 meters above ground |
| shortwave_radiation | float32 | W/m² | Shortwave solar radiation |
| direct_radiation | float32 | W/m² | Direct solar radiation on horizontal plane |
| diffuse_radiation | float32 | W/m² | Diffuse solar radiation |
| direct_normal_irradiance | float32 | W/m² | Direct solar radiation on normal plane |
### 4. Versioned Weather Forecasts (`weather_forecasts_versioned/`)
Time-versioned weather forecasts with lead times up to 7 days ahead, simulating real-world data availability. This component provides the most realistic forecasting scenario.
> [!NOTE]
> This enables realistic evaluation where forecasts are only available at specific times with specific lead times, matching real-world operational constraints.
| Column | Type | Unit | Description |
|:------:|:----:|:----:|:-----------:|
| timestamp | datetime64[ns, UTC] | - | Target forecast timestamp |
| available_at | datetime64[ns, UTC] | - | When the forecast was available/created |
| temperature_2m | float32 | °C | Air temperature at 2 meters above ground |
| relative_humidity_2m | float32 | % | Relative humidity at 2 meters above ground |
| surface_pressure | float32 | hPa | Atmospheric pressure at surface level |
| cloud_cover | float32 | % | Total cloud cover as area fraction |
| wind_speed_10m | float32 | km/h | Wind speed at 10 meters above ground |
| wind_speed_80m | float32 | km/h | Wind speed at 80 meters above ground |
| wind_direction_10m | float32 | ° | Wind direction at 10 meters above ground |
| shortwave_radiation | float32 | W/m² | Shortwave solar radiation |
| direct_radiation | float32 | W/m² | Direct solar radiation on horizontal plane |
| diffuse_radiation | float32 | W/m² | Diffuse solar radiation |
| direct_normal_irradiance | float32 | W/m² | Direct solar radiation on normal plane |
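
With several forecast versions per target timestamp, a common step is to pick the newest version that had been published by the time a load forecast is made. The sketch below illustrates this on synthetic rows. The helper name `latest_available` and the sample values are assumptions for illustration, and it assumes `timestamp` and `available_at` are regular columns.

```python
import pandas as pd


def utc(text: str) -> pd.Timestamp:
    """Parse a timestamp string as UTC."""
    return pd.Timestamp(text, tz="UTC")


def latest_available(forecasts: pd.DataFrame, forecast_time: pd.Timestamp) -> pd.DataFrame:
    """For each target timestamp, keep the newest version published by forecast_time."""
    visible = forecasts[forecasts["available_at"] <= forecast_time]
    return (
        visible.sort_values("available_at")
        .drop_duplicates(subset="timestamp", keep="last")
        .reset_index(drop=True)
    )


# Three versions of a forecast for the same target time (synthetic values).
forecasts = pd.DataFrame(
    {
        "timestamp": [utc("2024-03-02 12:00")] * 3,
        "available_at": [
            utc("2024-03-01 00:00"),
            utc("2024-03-01 12:00"),
            utc("2024-03-02 00:00"),
        ],
        "temperature_2m": [8.0, 9.0, 10.0],
    }
)

# At 2024-03-01 18:00 UTC the third version does not exist yet.
selected = latest_available(forecasts, utc("2024-03-01 18:00"))
```

Here the second version is selected, because the third one only becomes available after the forecast time.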
### 5. EPEX Day-Ahead Prices (`EPEX.parquet`)
Day-ahead electricity prices for the Netherlands from ENTSO-E Transparency Platform, providing market price signals that influence energy consumption patterns.
| Column | Type | Unit | Description |
|:------:|:----:|:----:|:-----------:|
| timestamp | datetime64[ns, UTC] | - | Price delivery timestamp in UTC |
| available_at | datetime64[ns, UTC] | - | When the price was published/available |
| price | float64 | €/MWh | Day-ahead electricity price in euros per megawatt hour |
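
Prices are given in €/MWh while load is given in watts at 15-minute resolution, so combining the two needs a unit conversion. As a worked example, assuming the load is constant over the interval and expressed in watts as in the load table (the function name `interval_cost_eur` is hypothetical):

```python
def interval_cost_eur(
    load_w: float, price_eur_per_mwh: float, interval_hours: float = 0.25
) -> float:
    """Cost of drawing a constant load_w for one interval at the given price."""
    # W * h gives Wh, and 1 MWh equals 1,000,000 Wh.
    energy_mwh = load_w * interval_hours / 1_000_000
    return energy_mwh * price_eur_per_mwh


# 2 MW for 15 minutes is 0.5 MWh; at 80 EUR/MWh that is 40 EUR.
cost = interval_cost_eur(load_w=2_000_000, price_eur_per_mwh=80.0)
```

This conversion does not apply to the normalized solar and wind park series, whose values are not absolute power.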
### 6. Electricity Consumption Profiles (`profiles.parquet`)
Standardized electricity consumption profiles from Energiedatawijzer for various customer categories in the Netherlands, representing typical usage patterns throughout the year. The values are typically normalized to sum to 1 over the year. There are 15 profiles, named `{category}_{type}_{direction}`: `category` describes the connection type, `type` indicates whether the connection has infeed, and `direction` indicates whether the profile describes consumption or generation. Only consumption profiles are included, because the infeed profiles reflect the previous year's generation. For a full description of the profiles, see the [Energiedatawijzer documentation](https://energiedatawijzer.nl/documenten/profielen-elektriciteit-2024/).
| Column | Type | Unit | Description |
|:------:|:----:|:----:|:-----------:|
| timestamp | datetime64[ns, UTC] | - | Profile timestamp in UTC |
| available_at | datetime64[ns, UTC] | - | Data availability timestamp |
| {profiles} | float64 | - | 15 profiles for different categories |
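
The `{category}_{type}_{direction}` naming scheme can be split programmatically. The sketch below is illustrative only: `parse_profile_name` is a hypothetical helper, and the column name `E1A_AZI_A` is an assumed example of the pattern that has not been checked against the actual columns in `profiles.parquet`.

```python
from typing import NamedTuple


class ProfileName(NamedTuple):
    category: str   # connection type
    type: str       # with or without infeed
    direction: str  # consumption or generation


def parse_profile_name(column: str) -> ProfileName:
    """Split a `{category}_{type}_{direction}` column name into its parts."""
    # Split from the right so a category containing "_" would stay intact.
    category, type_, direction = column.rsplit("_", 2)
    return ProfileName(category, type_, direction)


# Assumed example name; replace with a real column from profiles.parquet.
parsed = parse_profile_name("E1A_AZI_A")
```

Grouping the profile columns by `category` or `type` this way can help when selecting which profiles to use as predictors for a given target.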
## Uses
This dataset is intended for energy forecasting research, providing a standardized benchmark for comparing different forecasting approaches in the energy domain. The dataset supports various forecasting horizons and scenarios:
- **Operational Forecasting**: 15-minute to 24-hour ahead load predictions
- **Day-ahead Congestion Management**: Using weather forecasts for next-day congestion predictions
- **Multi-modal Forecasting**: Combining multiple infrastructure types and weather variables
- **Uncertainty Quantification**: Using versioned forecasts to assess prediction uncertainty
- **Weather-Energy Relationship Studies**: Analyzing correlations between weather variables and electrical load
This dataset is compatible with various forecasting frameworks, including **[OpenSTEF](https://github.com/OpenSTEF/openstef)** (Open Short Term Energy Forecasting), classical time series models, machine learning approaches, and deep learning models.
## Dataset Structure
The dataset is organized in the following directory structure:
```
liander2024/
├── liander2024_targets.yaml # Location metadata with coordinates
├── load_measurements/ # Electrical load data
│ ├── mv_feeder/ # Medium voltage feeder measurements
│ ├── station_installation/ # Substation installation measurements
│ ├── transformer/ # Transformer measurements
│ ├── solar_park/ # Anonymized solar park measurements
│ └── wind_park/ # Anonymized wind park measurements
├── weather_measurements/ # Historical weather data
│ └── [same subdirectory structure as above]
├── weather_forecasts/ # Latest weather forecasts
│ └── [same subdirectory structure as above]
├── weather_forecasts_versioned/ # Time-versioned weather forecasts
│ └── [same subdirectory structure as above]
├── EPEX.parquet # Day-ahead electricity prices
└── profiles.parquet # Electricity consumption profiles
```
Each subdirectory contains individual Parquet files for each location, named according to the location identifier.
### Target Metadata (`liander2024_targets.yaml`)
The `liander2024_targets.yaml` file contains metadata for all 55 forecasting targets in the dataset. Each target includes:
| Field | Type | Description |
|:-----:|:----:|:-----------:|
| name | string | Unique identifier for the location/asset |
| group_name | string | Infrastructure type: `mv_feeder`, `transformer`, `station_installation`, `solar_park`, or `wind_park` |
| latitude | float | Approximate latitude coordinate* |
| longitude | float | Approximate longitude coordinate* |
| description | string | Human-readable description of the location |
| benchmark_start | datetime | Start of the benchmark evaluation period |
| benchmark_end | datetime | End of the benchmark evaluation period |
| train_start | datetime | Start of the training data period |
| upper_limit | float | 98th percentile of load values (W) |
| lower_limit | float | 2nd percentile of load values (W) |
\* Location coordinates are approximate and derived only from the name of the target.
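
To illustrate what `lower_limit` and `upper_limit` represent, the sketch below computes the 2nd and 98th percentiles of a synthetic load series. The helper `percentile_limits` is hypothetical, and the interpolation method used for the published limits is not stated in this card, so values computed this way may differ slightly from those in the YAML file.

```python
import pandas as pd


def percentile_limits(
    load: pd.Series, lower: float = 0.02, upper: float = 0.98
) -> tuple[float, float]:
    """Return the (lower, upper) quantiles of a load series."""
    return float(load.quantile(lower)), float(load.quantile(upper))


# Synthetic load values 0, 1, ..., 100 W, used only to show the calculation.
load = pd.Series(range(101), dtype="float64")
lower_limit, upper_limit = percentile_limits(load)
```

Limits like these can be used to scale errors or to flag outliers without letting a few extreme measurements dominate.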
## Dataset Creation
### Source Data
#### Liander Historical Measurements
- **Source**: [Liander Open Data - Historical 15-minute Operational Measurements](https://www.liander.nl/over-ons/open-data#historische-15-minuten-bedrijfsmetingen)
- **License**: [See custom disclaimer](https://www.liander.nl/over-ons/open-data/disclaimer)
- **Description**: 15-minute electrical load measurements from various infrastructure types across Liander's service territory
- **Modifications made**: Converted into standardized Parquet format, removed the `_normalized` suffix from the load column, added `available_at` timestamps.
#### OpenMeteo Weather Data
- **Source**: [OpenMeteo Historical Weather API](https://open-meteo.com/)
- **License**: CC BY 4.0
- **Description**: Historical weather measurements and forecasts using the best available weather models
#### ENTSO-E Day-Ahead Prices
- **Source**: [ENTSO-E Transparency Platform](https://newtransparency.entsoe.eu/)
- **License**: CC BY 4.0
- **Description**: Day-ahead electricity prices for the Netherlands (EPEX Spot NL)
- **Modifications made**: Converted into Parquet format, converted to UTC, added `available_at` timestamp based on when day-ahead prices become available in NL.
#### Energiedatawijzer Consumption Profiles
- **Source**: [Energiedatawijzer - Profielen elektriciteit 2024](https://energiedatawijzer.nl/documenten/profielen-elektriciteit-2024/)
- **License**: None, but permission granted for use in this dataset
- **Description**: Standardized electricity consumption profiles for various customer categories in the Netherlands
- **Modifications made**: Converted into Parquet format, converted to UTC, added `available_at` timestamp, removed infeed profiles, and filled the last hour of the year with the first hour of the year to obtain a full UTC year.
> [!NOTE]
> Location coordinates are approximate and may not represent exact facility locations. Solar and wind park data is normalized and anonymized for privacy. Weather data is interpolated from hourly to 15-minute resolution to match load measurements.
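
The upsampling from hourly to 15-minute resolution mentioned in the note can be reproduced roughly as follows. This is a sketch under the assumption of linear interpolation; the card does not state which interpolation method was used, and the values are synthetic.

```python
import pandas as pd

# Two synthetic hourly temperature readings, one hour apart.
hourly = pd.Series(
    [10.0, 14.0],
    index=pd.date_range("2024-01-01 00:00", periods=2, freq="60min", tz="UTC"),
    name="temperature_2m",
)

# Upsample to a 15-minute grid and fill the new points linearly.
quarter_hourly = hourly.resample("15min").interpolate(method="linear")
```

The result has five points (00:00, 00:15, 00:30, 00:45, 01:00) that rise evenly from 10 to 14, so the interpolated series carries no more information than the hourly one.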
## How to Use
You can load the dataset files directly into pandas dataframes:
```python
import pandas as pd

# Reading hf:// paths requires the huggingface_hub package to be installed.
base = "hf://datasets/OpenSTEF/liander2024-energy-forecasting-benchmark"

# Per-location files live under <component>/<group_name>/<location name>.parquet.
load_data = pd.read_parquet(f"{base}/load_measurements/mv_feeder/OS Edam.parquet")
weather_data = pd.read_parquet(f"{base}/weather_forecasts_versioned/mv_feeder/OS Edam.parquet")

# Prices and profiles are single files shared by all locations.
epex = pd.read_parquet(f"{base}/EPEX.parquet")
profiles = pd.read_parquet(f"{base}/profiles.parquet")
```
Provided by:
cat1233211



