Dataset: Efficient improvement for water quality analysis with large amount of missing data
doi.org | Updated: 2022-07-26 | Indexed: 2025-03-25
Download link:
http://doi.org/10.17632/8y42cbc7h8.1
Description:
Water is vital for life, and local water pollution can damage the environment and affect human health. Governments and private institutions monitor and regulate water quality to protect the environment and populations. The consequences of pollution can be far-reaching, costing companies significant amounts in cleanup and lost reputation. Most countries have officially accredited laboratories and sampling teams that combine varied technology, global expertise and local knowledge to monitor water quality across different types of water and sampling locations. However, one of the main problems in monitoring and assessing water quality, and in meeting minimum standards of potability or usability, is the analysis of samples based on local data. In many cases the resulting datasets contain a large proportion of missing values, whether because of the methodology or technique used or the expertise of the personnel who handle the samples. How serious this is depends on the analysis to be carried out. For example, a water quality index estimated from such samples may be biased by the missing information.
This dataset was used to prepare the manuscript: Efficient improvement for water quality analysis with large amount of missing data, by D. Sierra-Porta and M. Tobón-Ospino. The manuscript is being submitted to Sustainable Production and Consumption (2022, Elsevier), a publication of the Institution of Chemical Engineers.
Provider:
doi.org