
High-Quality Daily PM2.5 Datasets for India at 10 km Resolution (Version 2)

Indexed by NIAID Data Ecosystem on 2026-05-02
Download link:
https://zenodo.org/record/10807118
Description:
If you use this dataset in your research or work, please cite the following paper:

Kawano, Ayako, et al. "Improved daily PM2.5 estimates in India reveal inequalities in recent enhancement of air quality." Science Advances 11.4 (2025): eadq1071. DOI: 10.1126/sciadv.adq1071

Thank you for acknowledging our work!

This record provides open-source daily fine particulate matter (PM2.5) datasets at 10 km resolution for India from 2005 to 2023. The estimates come from a region-specific two-stage machine learning model that was validated on held-out monitor data it was not trained on. Our model demonstrates robust out-of-sample performance, substantially outperforming existing publicly available monthly PM2.5 datasets.

To take advantage of both the longer time series of Aerosol Optical Depth (AOD) data and information from newer sensors such as the TROPOspheric Monitoring Instrument (TROPOMI), we developed two separate machine learning models: the "Full model" and the "AOD model".

Full model:
  Predictive performance (spatial cross-validation): R2 of 0.67, RMSE of 27.79 μg/m3
  Input features: Moderate Resolution Imaging Spectroradiometer (MODIS) AOD and TROPOMI satellite inputs, along with other remote sensing data
  Daily PM2.5 predictions for: July 10, 2018 - September 30, 2023

AOD model:
  Predictive performance (spatial cross-validation): R2 of 0.64, RMSE of 32.08 μg/m3
  Input features: all inputs used for the Full model except TROPOMI
  Daily PM2.5 predictions for: January 1, 2005 - September 30, 2023

Please note that we used spatial cross-validation (CV) rather than the more conventional random CV, because the model is meant to predict daily PM2.5 concentrations at locations across India that have no air quality monitors. When the Full model above was evaluated using 10-fold random CV, it showed notably higher performance (R2 of 0.85 and RMSE of 18.48 μg/m3). This highlights the potential of random CV to overstate model performance in critical real-world applications.

The code and source data needed to replicate the results have also been deposited.
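The difference between random and spatial CV described above can be illustrated with a minimal sketch. It is not the authors' code: it assumes that spatial folds are formed by assigning whole monitors to folds (the deposited code may build folds differently, for example from spatial blocks), and the monitor IDs and helper names (`random_folds`, `spatial_folds`, `monitors_shared_with_training`) are hypothetical. The `rmse` and `r2` helpers use the standard definitions of the two reported metrics.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root-mean-square error, in the units of the inputs (e.g. ug/m3)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def random_folds(n_rows, k, seed=0):
    """Random CV: shuffle individual rows, so one monitor can be in train and test."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def spatial_folds(monitor_ids, k, seed=0):
    """Grouped CV (assumed here): whole monitors go to one fold, so test monitors are unseen."""
    monitors = sorted(set(monitor_ids))
    random.Random(seed).shuffle(monitors)
    fold_of = {m: i % k for i, m in enumerate(monitors)}
    folds = [[] for _ in range(k)]
    for row, m in enumerate(monitor_ids):
        folds[fold_of[m]].append(row)
    return folds


def monitors_shared_with_training(folds, monitor_ids):
    """For each fold, count test monitors that also contribute training rows."""
    shared = []
    for test_rows in folds:
        test_set = set(test_rows)
        test_monitors = {monitor_ids[r] for r in test_rows}
        train_monitors = {m for r, m in enumerate(monitor_ids) if r not in test_set}
        shared.append(len(test_monitors & train_monitors))
    return shared


# Hypothetical toy data: 20 monitors with 30 daily observations each.
monitor_ids = [f"M{m:02d}" for m in range(20) for _ in range(30)]
rand = random_folds(len(monitor_ids), k=5)
spat = spatial_folds(monitor_ids, k=5)
leak_random = monitors_shared_with_training(rand, monitor_ids)
leak_spatial = monitors_shared_with_training(spat, monitor_ids)
print("test monitors also seen in training, random CV: ", leak_random)
print("test monitors also seen in training, spatial CV:", leak_spatial)
```

Under random CV, every test fold contains monitors whose other days were used for training, so the score partly measures interpolation in time at known sites. Under the grouped split, each test fold contains only monitors the model has never seen, which is closer to the use case of predicting at unmonitored locations.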
Created:
2025-01-30