Ambient Air Quality Prediction Model Training Dataset
Hefei Data Element Circulation Platform (合肥数据要素流通平台) | Updated 2025-12-18 | Indexed 2025-12-20
Download link:
https://www.bigdatadex.com.cn/dataCirculation/listMoreb/details?shopId=2001488958309605378&commodityType=2
Resource overview:
This dataset combines ambient air quality observations with ERA5 atmospheric reanalysis data. It has been cleaned and curated so that it can be used to train models with different forecast horizons. Used directly as training data for air quality prediction models, it lets a model learn how pollutant concentrations evolve locally while also capturing how large-scale meteorological conditions drive pollutant transport and transformation. These multi-dimensional training samples aim to improve prediction accuracy for heavy pollution episodes and cross-regional transport events, in support of short-term, medium-term, and long-term ambient air quality forecasting.
The dataset has two parts:
1. Ambient air quality dataset: hourly and daily ambient air quality data from 2014 to the present (nearly 11 years), covering more than 1,600 national control monitoring stations and more than 300 cities across China. It shows the air quality status of each city over a given period and provides computable data for building and tuning AI models for environmental prediction, pollution source tracing, and intelligent early warning.
2. ERA5 atmospheric reanalysis dataset: a reconstruction of weather conditions from 1940 to the present, produced by combining global observations, numerical models, and physical parameterization schemes through data assimilation and numerical simulation. It provides atmospheric and surface variables at high spatial and temporal resolution, including temperature, humidity, wind speed, precipitation, cloud cover, surface radiation, and surface temperature, and can be used in climate research, weather analysis, climate model validation, and environmental monitoring.
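The description above says the two parts can be combined to train models with different forecast horizons. The sketch below illustrates one way this might be done: hourly station records are joined with meteorological records on their timestamp, and sliding windows are cut into (features, target) pairs for a chosen horizon. The listing does not publish a file schema, so the field names `time`, `pm25`, `t2m`, and `wind`, the helper functions, and the toy values are all hypothetical.

```python
from datetime import datetime, timedelta


def join_by_hour(air_rows, met_rows):
    """Inner-join two lists of dicts on their hourly 'time' key."""
    met_by_time = {row["time"]: row for row in met_rows}
    joined = []
    for row in air_rows:
        met = met_by_time.get(row["time"])
        if met is not None:
            merged = dict(row)
            merged.update({k: v for k, v in met.items() if k != "time"})
            joined.append(merged)
    return sorted(joined, key=lambda r: r["time"])


def make_samples(rows, feature_keys, target_key, history, horizon):
    """Build (features, target) pairs.

    Each sample uses `history` consecutive hours of features and predicts
    `target_key` at `horizon` hours after the last input hour.
    Windows with missing hours are skipped.
    """
    by_time = {r["time"]: r for r in rows}
    samples = []
    for r in rows:
        end = r["time"]
        window = [end - timedelta(hours=h) for h in range(history - 1, -1, -1)]
        target_time = end + timedelta(hours=horizon)
        if target_time not in by_time or any(t not in by_time for t in window):
            continue
        x = [by_time[t][k] for t in window for k in feature_keys]
        samples.append((x, by_time[target_time][target_key]))
    return samples


# Synthetic toy records (hypothetical field names, not the real schema).
start = datetime(2024, 1, 1, 0)
air = [{"time": start + timedelta(hours=i), "pm25": 10.0 + i} for i in range(6)]
met = [{"time": start + timedelta(hours=i), "t2m": 270.0 + i, "wind": 2.0}
       for i in range(6)]

rows = join_by_hour(air, met)
features = ["pm25", "t2m", "wind"]
short = make_samples(rows, features, "pm25", history=2, horizon=1)
longer = make_samples(rows, features, "pm25", history=2, horizon=3)
print(len(short), len(longer))
```

Changing only `horizon` reuses the same joined records for short or longer forecast targets. A longer horizon leaves fewer complete windows in a fixed time span.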
Provider:
天津天融环境科技发展有限公司
Created:
2025-12-18
Collected summary
Dataset introduction

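Background and challenges
Background overview
This is a combined dataset for training ambient air quality prediction models. It has two parts: hourly and daily ambient air quality data from 2014 to the present for more than 1,600 national control monitoring stations covering more than 300 cities across China; and ERA5 global atmospheric reanalysis data from 1940 to the present, including temperature, humidity, and other meteorological variables. After cleaning and curation, the dataset can be used to train models with different forecast horizons, helping them learn how pollutants evolve and how meteorological conditions drive them, with the aim of improving prediction accuracy for heavy pollution episodes and cross-regional transport events in short-term to long-term air quality forecasting.
The above content was collected and summarized by 遇见数据集.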



