
PMHMs_dataset

Source: DataCite Commons. Updated 2024-02-21; indexed 2024-08-19.
Download link:
https://figshare.com/articles/dataset/PMHMs_dataset/24762513/3
Description:
Nationwide data on the distribution of atmospheric heavy metal emissions are lacking, which significantly constrains environmental research and public health assessment. To address this gap, we established a dataset of atmospheric Cr, Cd, As, and Pb emissions (PMHMs) covering 367 municipalities in China. We first collected PMHMs data together with covariates comprising 10 indicators, such as industrial emissions, vehicle emissions, and meteorological variables. We then assessed nine machine learning models on the testing dataset using the coefficient of determination (R²), root-mean-square error (RMSE), and mean absolute error (MAE): Linear Regression (LR), Ridge, Bayesian Ridge (Bayesian), K-Neighbors Regressor (KNN), MLP Regressor (MLP), Random Forest Regressor (RF), LGBM Regressor (LGBM), Lasso, and ElasticNet. The RF and LGBM models were chosen for their favorable predictive performance (R²: 0.58–0.84, with lower RMSE and MAE), which supports their robustness for this modelling task. The dataset is a resource for informing environmental policy, monitoring air quality, conducting environmental assessments, and supporting academic research.
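The three evaluation metrics named in the description can be sketched in plain Python as follows. This is a minimal illustration, not the dataset authors' code: the function names, the `rank_models` helper, the model labels, and the sample values are all hypothetical and chosen only to show how R², RMSE, and MAE are computed from test-set predictions and used to compare models.

```python
import math

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root-mean-square error."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rank_models(y_true, predictions):
    """Score each model's test-set predictions; sort by R2, best first."""
    scores = {
        name: {
            "r2": r2_score(y_true, y_pred),
            "rmse": rmse(y_true, y_pred),
            "mae": mae(y_true, y_pred),
        }
        for name, y_pred in predictions.items()
    }
    return sorted(scores.items(), key=lambda item: item[1]["r2"], reverse=True)

# Made-up test-set values, for illustration only (not from the dataset).
y_true = [1.0, 2.0, 3.0, 4.0]
predictions = {
    "model_a": [1.0, 2.0, 3.0, 5.0],   # one prediction off by 1
    "model_b": [2.5, 2.5, 2.5, 2.5],   # always predicts the mean
}

ranking = rank_models(y_true, predictions)
for name, metrics in ranking:
    print(name, metrics)
```

A model with a higher R² and lower RMSE and MAE ranks first, which mirrors the selection criterion the description states for RF and LGBM. In practice, library implementations of these metrics would normally be used instead of hand-written ones.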

Provider:
figshare
Created:
2024-01-26