Drainage Network Flow and Liquid Level Monitoring and Evaluation Dataset Based on the Smart Water Twin (智水孪生)

Tianjin Data Intellectual Property Registration Platform (天津市数据知识产权登记平台) · Updated 2025-10-11 · Indexed 2025-10-23
Download link:
https://dengji.tjippc.cn/xxgg_nr?id=849be7a5-fdf6-4379-81ef-0f7820d3b6ec
Resource description:
1. Deduplication and identification of missing values and outliers:
(1) The data for each device indicator (flow or liquid level) are first deduplicated so that each device has only one record per time point.
(2) The data are then checked in batch to identify missing values and outliers. Outliers are flagged by multiple rules: consecutive zero values, out-of-range values, statistical outliers, and abrupt jumps.
(3) Data validity is reviewed manually, and the result is recorded in the validate field: 0 means invalid (missing values and genuine anomalies confirmed by manual review) and 1 means valid.

2. Multi-dimensional weighted imputation: invalid records are imputed using three sources of information:
(1) Temporal history reference: the mean of the values at the 3 valid time points preceding the missing point is used as the temporal reference value. This component reflects the temporal continuity and trend of the data.
(2) Spatial proximity reference: based on the correlation matrix between device indicators, the 2 devices most highly correlated with the target device are selected as spatial neighbors. The mean of their valid values at the same time point is used as the spatial reference value. This component exploits the spatial correlation between devices.
(3) Rainfall factor reference: to account for the influence of rainfall as an environmental factor, the rainfall data at the corresponding time point are used as a reference. This component captures the instantaneous effect of rainfall changes on the monitored indicators.

3. Weighted fusion: the temporal, spatial, and rainfall reference values are combined as a weighted average with adjustable weights to give the final imputed value. The weights are w_time=0.4, w_space=0.4, and w_env=0.2. If one reference value is missing, the weights are adjusted automatically so that the imputed value can still be computed. If all reference values are missing, the previous valid value is used to keep the series continuous.
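The weighted fusion step can be illustrated with a short Python sketch. This is not the provider's implementation, only one plausible reading of the description. Three assumptions are made that the listing does not confirm: "automatically adjusted" is taken to mean renormalizing the remaining weights to sum to 1; the temporal reference uses up to 3 preceding valid points when fewer are available; and the rainfall reference is assumed to be already expressed in the indicator's units, since the listing does not say how rainfall is converted. The function names (mean_or_none, temporal_reference, fuse) are hypothetical.

```python
from typing import Optional, Sequence

# Weights as stated in the dataset description.
W_TIME, W_SPACE, W_ENV = 0.4, 0.4, 0.2


def mean_or_none(values: Sequence[Optional[float]]) -> Optional[float]:
    """Average the non-missing values; return None if there are none."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


def temporal_reference(series, valid, idx, n_points=3):
    """Mean of up to `n_points` valid readings before position `idx`.

    `valid` mirrors the dataset's validate field (1 = valid, 0 = invalid).
    Using fewer than 3 points when history is short is an assumption.
    """
    history = [series[i] for i in range(idx - 1, -1, -1) if valid[i] == 1]
    return mean_or_none(history[:n_points])


def fuse(time_ref, space_ref, env_ref, previous_valid=None,
         weights=(W_TIME, W_SPACE, W_ENV)):
    """Weighted average over the references that are available.

    Assumption: missing references are dropped and the remaining weights
    are renormalized. If every reference is missing, fall back to the
    previous valid value, as the description states.
    """
    refs = (time_ref, space_ref, env_ref)
    pairs = [(w, r) for w, r in zip(weights, refs) if r is not None]
    total = sum(w for w, _ in pairs)
    if total == 0:
        return previous_valid
    return sum(w * r for w, r in pairs) / total


# Usage: impute the invalid reading at index 3 of a toy series.
series = [10.0, 12.0, 11.0, None]
valid = [1, 1, 1, 0]
t_ref = temporal_reference(series, valid, 3)   # mean of 11, 12, 10
s_ref = mean_or_none([13.0, 15.0])             # two most-correlated devices
filled = fuse(t_ref, s_ref, None)              # rainfall reference missing
print(filled)
```

In this toy case the rainfall reference is missing, so the temporal and spatial weights are each renormalized to 0.5 and the imputed value is the midpoint of the two references.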
Provider:
North China Municipal Engineering Design & Research Institute Co., Ltd. (中国市政工程华北设计研究总院有限公司)
Created:
2025-09-30
Collected summary
Dataset introduction
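Background and challenges
Background overview
This is a drainage pipe network monitoring dataset for the water sector. It contains 6.05 million flow and liquid level monitoring records, is updated once a year, and is intended to support emergency flood response for urban waterlogging and the optimization of drainage scheduling. A multi-dimensional weighted imputation algorithm is used to handle missing and anomalous data in order to safeguard data quality. The dataset is suited to system evaluation and model calibration by water supply and drainage authorities and research institutions.
The above content was collected and summarized by 遇见数据集.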