SDWPF: A Dataset for Spatial Dynamic Wind Power Forecasting over a Large Turbine Array
Source: DataCite Commons. Updated 2025-06-01; indexed 2024-08-19.
Download link:
https://figshare.com/articles/dataset/SDWPF_dataset/24798654/2
Description:
<br><b>Paper</b><br>This dataset accompanies the paper "SDWPF: A Dataset for Spatial Dynamic Wind Power Forecasting over a Large Turbine Array," published in Scientific Data. The paper is available at <i>https://www.nature.com/articles/s41597-024-03427-5</i>. If you find this dataset useful, please consider citing our paper.
<br><b>Scientific Data Paper</b><pre>@article{zhou2024sdwpf,<br>  title={SDWPF: A Dataset for Spatial Dynamic Wind Power Forecasting over a Large Turbine Array},<br>  author={Zhou, Jingbo and Lu, Xinjiang and Xiao, Yixiong and Tang, Jian and Su, Jiantao and Li, Yu and Liu, Ji and Lyu, Junfu and Ma, Yanjun and Dou, Dejing},<br>  journal={Scientific Data},<br>  volume={11},<br>  number={1},<br>  pages={649},<br>  year={2024},<br>  url={https://doi.org/10.1038/s41597-024-03427-5},<br>  publisher={Nature Publishing Group}<br>}</pre>
<br><b>Baidu KDD Cup Paper</b><pre>@article{zhou2022sdwpf,<br>  title={SDWPF: A Dataset for Spatial Dynamic Wind Power Forecasting Challenge at KDD Cup 2022},<br>  author={Zhou, Jingbo and Lu, Xinjiang and Xiao, Yixiong and Su, Jiantao and Lyu, Junfu and Ma, Yanjun and Dou, Dejing},<br>  journal={arXiv preprint arXiv:2208.04360},<br>  url={https://arxiv.org/abs/2208.04360},<br>  year={2022}<br>}</pre>
<br><b>Background</b><br>The SDWPF dataset was collected over two years from a wind farm with 134 turbines. It records the spatial layout of the turbines and the dynamic context factors for each one. The dataset was used to launch the ACM KDD Cup 2022, which attracted registrations from over 2,400 teams worldwide. To facilitate its use, we have released the dataset in two parts: sdwpf_kddcup and sdwpf_full. The sdwpf_kddcup part is the original dataset used for the Baidu KDD Cup 2022 and comprises both training and test data. The sdwpf_full part is a more comprehensive collection that includes data not available during the KDD Cup, such as weather conditions, dates, and elevation.
<br><b>sdwpf_kddcup</b><br>The <b><i>sdwpf_kddcup</i></b> dataset is the original dataset used for the Baidu KDD Cup 2022 Challenge. 
The folder structure of sdwpf_kddcup is:<pre>sdwpf_kddcup<br> --- sdwpf_245days_v1.csv<br> --- sdwpf_baidukddcup2022_turb_location.csv<br> --- final_phase_test<br>      --- infile<br>           --- 0001in.csv<br>           --- 0002in.csv<br>           --- ...<br>      --- outfile<br>           --- 0001out.csv<br>           --- 0002out.csv<br>           --- ...<br></pre>The files and folders in sdwpf_kddcup are as follows:
<br><b><i>sdwpf_245days_v1.csv</i></b>: The dataset released for the KDD Cup 2022 challenge, spanning 245 days.
<br><b><i>sdwpf_baidukddcup2022_turb_location.csv</i></b>: The relative positions of all wind turbines in the dataset.
<br><b><i>final_phase_test</i></b>: The test data for the final phase of the Baidu KDD Cup. It allows new methods to be compared against those of the award-winning teams from KDD Cup 2022. It contains an 'infile' folder with the model input data and an 'outfile' folder with the ground truth for the corresponding output. In other words, for a model function y = f(x), x is a file in the 'infile' folder and the ground truth of y is the matching file in the 'outfile' folder, such as <b><i>{0001out} = f({0001in})</i></b>.
<br>More information about the sdwpf_kddcup data used for the Baidu KDD Cup 2022 can be found in the Baidu KDD Cup Paper: <i>SDWPF: A Dataset for Spatial Dynamic Wind Power Forecasting Challenge at KDD Cup 2022</i>.
<br><b>sdwpf_full</b><br>The <b><i>sdwpf_full</i></b> dataset offers more information than was released for the KDD Cup 2022. It includes not only SCADA data but also weather data such as relative humidity, wind speed, and wind direction, sourced from ERA5, the fifth generation of the European Centre for Medium-Range Weather Forecasts (ECMWF) atmospheric reanalyses of the global climate. The dataset covers two years of data from a wind farm with 134 wind turbines, from January 2020 to December 2021. 
The folder structure of sdwpf_full is:<pre>sdwpf_full<br>--- sdwpf_turb_location_elevation.csv<br>--- sdwpf_2001_2112_full.csv<br>--- sdwpf_2001_2112_full.parquet<br></pre>The files in sdwpf_full are as follows:
<br><b><i>sdwpf_turb_location_elevation.csv</i></b>: The relative positions and elevations of all wind turbines in the dataset.
<br><b><i>sdwpf_2001_2112_full.csv</i></b>: Data collected over two years from a wind farm containing 134 wind turbines, spanning January 2020 to December 2021. It improves on sdwpf_kddcup/sdwpf_245days_v1.csv in three ways:
<br>Extended time span: It spans two years, from January 2020 to December 2021, whereas sdwpf_245days_v1.csv covers only 245 days.
<br>Enriched weather information: It adds data such as relative humidity, wind speed, and wind direction, sourced from ERA5, the fifth generation of the European Centre for Medium-Range Weather Forecasts (ECMWF) atmospheric reanalyses of the global climate.
<br>Expanded temporal details: During the KDD Cup Challenge, timestamp information was withheld to prevent data linkage. This version includes a specific timestamp for each data point.
<br><b><i>sdwpf_2001_2112_full.parquet</i></b>: The same data as sdwpf_2001_2112_full.csv, stored in Parquet format.<br>
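The column layout of sdwpf_2001_2112_full.csv is not listed in this description, so it can help to inspect the header before loading the full two-year file. The following is a minimal sketch using only the Python standard library. The helper name `peek_csv` and the default of three preview rows are illustrative choices, not part of any tooling shipped with the dataset.

```python
import csv
from itertools import islice
from pathlib import Path


def peek_csv(path, n_rows=3):
    """Return (header, first n_rows data rows) without reading the whole file.

    Hypothetical helper: it makes no assumption about column names,
    it only reports whatever the file's first line contains.
    """
    with Path(path).open(newline="", encoding="utf-8") as fh:
        reader = csv.reader(fh)
        header = next(reader, [])            # empty list if the file is empty
        rows = list(islice(reader, n_rows))  # read at most n_rows further lines
    return header, rows
```

For example, `peek_csv("sdwpf_full/sdwpf_2001_2112_full.csv")` would return the column names and a few rows, assuming the archive was unpacked into the current directory. The same helper works on sdwpf_turb_location_elevation.csv. The Parquet copy holds the same data and can be read with a dataframe library instead.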
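The final_phase_test layout described above pairs each input file with a ground-truth file by its numeric prefix (0001in.csv with 0001out.csv, and so on). A minimal sketch of that pairing, using only the Python standard library, is shown below. The function name `pair_test_files` is hypothetical, and the sketch assumes the file names follow the `NNNNin.csv` / `NNNNout.csv` pattern shown in the folder tree. It does not read the CSV contents, since the column layout is not given here.

```python
import re
from pathlib import Path


def pair_test_files(final_phase_dir):
    """Return a sorted list of (case_id, infile_path, outfile_path) tuples.

    Assumes names like 0001in.csv and 0001out.csv, as in the folder tree.
    Inputs without a matching ground-truth file are skipped.
    """
    root = Path(final_phase_dir)
    in_pattern = re.compile(r"^(\d+)in\.csv$")
    pairs = []
    for in_path in sorted((root / "infile").iterdir()):
        match = in_pattern.match(in_path.name)
        if match is None:
            continue  # ignore files that do not follow the naming pattern
        case_id = match.group(1)
        out_path = root / "outfile" / f"{case_id}out.csv"
        if out_path.exists():
            pairs.append((case_id, in_path, out_path))
    return pairs
```

Each returned pair gives an input x and the file holding the ground truth of y for a model y = f(x). A forecasting method can then be scored by running it on every input file and comparing its output with the matching ground-truth file.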
Provider:
figshare
Created:
2024-04-30



