
Hybrid Prophet-XGBoost Forecasting for Urban Air Quality with Real-Time Data Integration for Chennai

NIAID Data Ecosystem, indexed 2026-05-02
Download link:
https://data.mendeley.com/datasets/br5dyxm7c6
Description:
This research aims to forecast short-term urban air quality by integrating meteorological variables and pollutant concentration data into a hybrid machine learning framework. The hypothesis is that combining historical pollutant data with weather features improves predictive accuracy across pollutant types.

Historical pollutant data (2022–2024) were collected from the WAQI (World Air Quality Index) historic dashboard, cleaned, and imputed for missing or anomalous values. The dataset includes daily values for PM₂.₅, PM₁₀, NO₂, SO₂, and O₃ from six key monitoring stations in Chennai. Supplementary weather and CAMS air quality data were sourced from the Open-Meteo APIs:

https://api.open-meteo.com/v1/forecast (daily weather)
https://air-quality-api.open-meteo.com/v1/air-quality (CAMS-based pollutant history)

These datasets were combined and processed to daily granularity, using linear interpolation for imputation and z-score-based outlier capping.

The air quality forecast covers a 7-day horizon and uses a hybrid approach: Prophet (for seasonality), XGBoost (for lag-based learning), and ETS (for fallback modeling). AQI was computed from the Indian CPCB sub-index breakpoints, identifying both the AQI level and the dominant pollutant for each day.

To evaluate performance, back-testing compared actual and predicted pollutant values. Accuracy, based on MAPE and sMAPE, was highest for O₃ and SO₂ and lowest for NO₂:

O₃: 95.22%
SO₂: 93.04%
PM₁₀: 80.28%
PM₂.₅: 76.78%
NO₂: 65.74%

The model captured pollutant dynamics and generated reliable AQI forecasts, suitable for environmental monitoring and early warning applications.
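The cleaning step described above (linear interpolation for missing days, then z-score-based outlier capping) could look roughly like the following. This is a minimal sketch under stated assumptions, not the dataset authors' code: the function names, the nearest-value fill for gaps at the edges of a series, and the `z_max` threshold are all hypothetical choices, since the description does not state them.

```python
from statistics import mean, pstdev


def interpolate_linear(values):
    """Fill None gaps by linear interpolation between the nearest observed days.

    Assumption: gaps at the start or end of the series are filled with the
    nearest observed value, because there is no second point to interpolate to.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(values)
    # Edge gaps: carry the first/last observation outward.
    for i in range(known[0]):
        filled[i] = values[known[0]]
    for i in range(known[-1] + 1, len(values)):
        filled[i] = values[known[-1]]
    # Interior gaps: straight line between neighbouring observations.
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


def cap_outliers(values, z_max=3.0):
    """Clip values whose z-score exceeds z_max to the mean +/- z_max * std bound.

    Assumption: z_max is a hypothetical threshold; the source does not give one.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return list(values)  # constant series, nothing to cap
    lower, upper = mu - z_max * sigma, mu + z_max * sigma
    return [min(max(v, lower), upper) for v in values]


# Hypothetical daily PM2.5 readings with missing days marked as None.
pm25 = [40.0, None, 50.0, None, None, 80.0]
pm25_filled = interpolate_linear(pm25)

# Hypothetical series with one spike, capped at two standard deviations.
spiky = [10.0] * 9 + [100.0]
spiky_capped = cap_outliers(spiky, z_max=2.0)
```

Capping (rather than dropping) outliers keeps the series at daily granularity, which matters for lag-based features; whether the authors capped per station, per pollutant, or over a rolling window is not stated.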
Created:
2025-06-30