ML-CNPM2.5
Source: DataCite Commons. Updated: 2025-04-27. Indexed: 2025-04-16.
Download link:
https://www.scidb.cn/detail?dataSetId=b3d0d82da3844b8a81731554ec802901
Description:
The first version of ML-CNPM2.5 compiles features that may affect ground-based PM2.5 in China from 2014 to 2023. With our gap-filling and calibration methods, we obtained more than 5 million samples (5,076,608), which, to our knowledge, is more PM2.5 samples than previous studies have covered. Because filled AOD is consistently less accurate than unfilled AOD, we also release a dataset restricted to unfilled AOD, with 1,790,210 records, to support training and assessing both primary and higher-accuracy ML-based models. The filled-AOD dataset is named ML-CNPM2.5-A and the unfilled-AOD dataset is named ML-CNPM2.5-B. ML-CNPM2.5-A contains twenty-four features and ML-CNPM2.5-B contains twenty-three. Most of these features directly or indirectly affect ground-based PM2.5 estimation from remote sensing and ML, and are therefore widely used as inputs to ML-based models. Fig. 1 and Fig. 2 show the distribution of each feature in ML-CNPM2.5-A and ML-CNPM2.5-B, respectively, including the median, quartiles, and outliers. For example, calibration visibly changes the distribution of Terra MAIAC AOD, narrowing its range from 0-8 to a more realistic 0-3. The discrete features (year, month, day, Doy, and LUC) are evenly distributed across their value ranges, indicating that the sample dataset is balanced and comprehensive. Detailed information about the features is listed in Table 2 for ML-CNPM2.5-A and in Table S1 for ML-CNPM2.5-B. Overall, the dataset includes the features commonly used in PM2.5 estimation, and its large volume of comprehensive records supports the training and validation of different models.
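The per-feature summaries mentioned above (median, quartiles, outliers) can be reproduced for any numeric column with a few lines of Python. This is a minimal sketch, not the authors' procedure: the 1.5 × IQR outlier rule is an assumption (the description does not say how outliers are defined in the figures), the function name `box_stats` is hypothetical, and the sample values are made up rather than taken from the dataset.

```python
from statistics import median, quantiles


def box_stats(values):
    """Return the median, quartiles, and 1.5*IQR outliers for one feature."""
    # Quartiles by linear interpolation over the observed values.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed outlier rule: points beyond 1.5 * IQR from the quartiles.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "median": median(values),
        "q1": q1,
        "q3": q3,
        "outliers": [v for v in values if v < low or v > high],
    }


# Made-up AOD-like values for illustration only, not real dataset records.
sample_aod = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 7.5]
stats = box_stats(sample_aod)
print(stats)
```

Applied to one column of ML-CNPM2.5-A or ML-CNPM2.5-B after loading the downloaded file, the same function would give the quantities that the figures display for that feature, under the stated outlier assumption.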
Provider:
Science Data Bank
Created:
2024-06-13



