ELMAS dataset
Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2024-08-18.
Download link:
https://figshare.com/articles/dataset/ELMAS_dataset/23889780/1
Description:
This dataset provides a set of 18 load profiles with an hourly temporal resolution that represent the main industrial and tertiary sectors in France for the year 2018. The ELMAS dataset is derived from a total of 55,730 consumption time series initially split into 424 business sectors and three levels of subscribed capacity. The customer's field of activity follows the Statistical Classification of Economic Activities in the European Community (NACE), a four-digit industry standard classification used in the European Union and composed of 21 sections, 88 divisions, 272 groups, and 615 classes. To preserve anonymity, the initial time series are averaged according to their NACE code and level of subscribed capacity.
Discrepancies between the temporal patterns of customers that belong to the same NACE section highlight the need for another clustering approach. A K-means algorithm is therefore used to gather the business groups sharing similar temporal patterns into 18 clusters. The resulting clustering shows that numerous NACE sections are scattered over various clusters, which increases the global heterogeneity of the clustering and makes it harder to interpret. The proportion of these dispersed NACE classes in terms of annual energy consumption remains low, which suggests that a manual reorganisation has little impact on the global consistency of the clusters. This manual reclassification is conducted in such a way that scattered NACE classes are gathered in the cluster that possesses the highest share of the considered NACE section.
The energy consumption time series dataset covers a limited panel of 55,730 customers, which may bias the output load profiles in comparison with the whole French panel of industrial and tertiary customers. To fill this gap, Enedis provides the annual energy consumption of a wider range of customers for the year 2019. This annual energy consumption dataset is used to generate weights implemented in the clustering approach and to derive weighted average time series for the clusters.
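The reclassification rule and the weighted averaging described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a cluster's "share" of a NACE section is measured by annual energy, that every class outside its section's dominant cluster counts as scattered, and that each group's weight is simply its annual energy. The field names (`nace`, `section`, `cluster`, `annual_energy`, `profile`) and the toy values are hypothetical, and the profiles are shortened to three steps instead of a full year of hourly values.

```python
from collections import defaultdict

# Toy business groups (hypothetical values, not taken from the ELMAS files).
groups = [
    {"nace": "10.71", "section": "C", "cluster": 0, "annual_energy": 300.0, "profile": [1.0, 2.0, 3.0]},
    {"nace": "25.11", "section": "C", "cluster": 0, "annual_energy": 100.0, "profile": [3.0, 2.0, 1.0]},
    {"nace": "16.10", "section": "C", "cluster": 1, "annual_energy": 50.0, "profile": [2.0, 2.0, 2.0]},
    {"nace": "47.11", "section": "G", "cluster": 1, "annual_energy": 200.0, "profile": [0.0, 4.0, 0.0]},
]


def dominant_cluster_by_section(groups):
    """Map each NACE section to the cluster holding its largest energy share.

    Assumption: the share is measured by summed annual energy.
    """
    energy = defaultdict(lambda: defaultdict(float))
    for g in groups:
        energy[g["section"]][g["cluster"]] += g["annual_energy"]
    return {section: max(per_cluster, key=per_cluster.get)
            for section, per_cluster in energy.items()}


def reassign_scattered(groups):
    """Move every group to the dominant cluster of its NACE section.

    Assumption: any class outside the dominant cluster is treated as scattered.
    """
    dominant = dominant_cluster_by_section(groups)
    return [{**g, "cluster": dominant[g["section"]]} for g in groups]


def weighted_cluster_profiles(groups):
    """Compute one energy-weighted average profile per cluster."""
    sums = {}
    totals = defaultdict(float)
    for g in groups:
        weight = g["annual_energy"]  # assumption: weight = annual energy
        acc = sums.setdefault(g["cluster"], [0.0] * len(g["profile"]))
        for t, value in enumerate(g["profile"]):
            acc[t] += weight * value
        totals[g["cluster"]] += weight
    return {c: [v / totals[c] for v in acc] for c, acc in sums.items()}


dominant = dominant_cluster_by_section(groups)
reassigned = reassign_scattered(groups)
profiles = weighted_cluster_profiles(reassigned)
```

In this toy input, the section C group that K-means placed in cluster 1 carries a small share of the section's energy, so it is moved to cluster 0 before the weighted profiles are computed. A real workflow would apply the same steps to the hourly series of each business group.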
Provider:
figshare
Created:
2023-09-03
Collected summary
Dataset introduction

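Background and challenges
Background overview
The ELMAS dataset contains 18 load profiles, at hourly resolution, representing the main industrial and tertiary sectors in France in 2018. It is built from 55,730 consumption time series, grouped using the NACE classification and K-means clustering, with weighted averaging applied to improve the representativeness and consistency of the data.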
The summary above was collected and generated by 遇见数据集.



