Patterns discovery dataset for particulate matter (pm2.5) pollution trends in Japan
Source: DataONE. Updated 2024-12-12. Indexed 2025-04-26.
Download link:
https://search.dataone.org/view/sha256:eff75034aa96c0a2fa3d66da79aec345cad726aa4caa46c831b041604af97198
Resource description:
Air pollution presents a significant environmental risk, impacting human health, accelerating climate change, and disrupting ecosystems. The main aim of air pollution research is to pinpoint the most harmful pollutants identified in previous studies and to map regions exposed to high pollution levels. This study introduces a large-scale, high-quality dataset to advance the analysis of PM2.5 pollution and reveal hidden patterns through pattern mining techniques. The dataset covers five years of hourly PM2.5 measurements collected from approximately 1,900 sensors across Japan, sourced from the Ministry of the Environment's Soramame platform. This platform offers hourly pollutant records, downloadable as monthly raw data files. The unorganised raw data files are systematically organised and stored in database tables using an Entity-Relationship (ER) schema.
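As a rough illustration of that organisation, the sketch below sets up a two-table layout (sensors and their observations) and queries PM2.5 rows from it. It is not the authors' schema: the table names, column names, and sample row are assumptions made for illustration, and it uses Python's built-in `sqlite3` module so that it runs without a database server.

```python
import sqlite3

# Hypothetical table and column names, chosen for illustration only.
SCHEMA = """
CREATE TABLE sensor (
    sensor_id   INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    address     TEXT,
    latitude    REAL,
    longitude   REAL
);
CREATE TABLE observation (
    sensor_id   INTEGER NOT NULL REFERENCES sensor(sensor_id),
    observed_at TEXT NOT NULL,      -- timestamp of the hourly reading
    pollutant   TEXT NOT NULL,      -- e.g. 'PM2.5'
    value       REAL,               -- NULL when nothing was recorded
    PRIMARY KEY (sensor_id, observed_at, pollutant)
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO sensor VALUES (1, 'Demo station', 'made-up address', 35.68, 139.77)"
    )
    conn.executemany(
        "INSERT INTO observation VALUES (?, ?, ?, ?)",
        [
            (1, "2018-01-01 01:00:00", "PM2.5", 12.0),
            (1, "2018-01-01 02:00:00", "PM2.5", None),
            (1, "2018-01-01 03:00:00", "PM2.5", 15.0),
        ],
    )
    conn.commit()
    return conn


def pm25_rows(conn: sqlite3.Connection):
    """Return (timestamp, sensor_id, value) tuples for PM2.5, ordered by time."""
    return conn.execute(
        """
        SELECT o.observed_at, s.sensor_id, o.value
        FROM observation AS o
        JOIN sensor AS s ON o.sensor_id = s.sensor_id
        WHERE o.pollutant = 'PM2.5'
        ORDER BY o.observed_at, s.sensor_id
        """
    ).fetchall()
```

Keeping sensor metadata in one table and readings in another means each sensor's details are stored once, and a join like the one in `pm25_rows` flattens the two tables into timestamp, sensor, and value rows.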
The primary objective of this dataset is to aid in developing and validating pattern mining models, enabling the accurate detection of...

The air pollution data was collected from Japan's Soramame platform, which provides hourly updates on pollutant levels nationwide. The data files span January 1, 2018, 01:00:00, to April 25, 2023, 22:00:00, and cover records from approximately 1,900 sensors stationed in various locations across Japan. The raw files are unorganised CSV files that require systematic organisation by year, month, time, sensor, and pollutant type. To maintain data integrity, we structured the dataset using an Entity-Relationship (ER) schema within a PostgreSQL database, comprising two main tables: the Sensor table (storing sensor name, ID, address, and location) and the Observations table (recording pollutant types and their values). A detailed step-by-step process is provided in the README, and this organisation produced a consolidated CSV file containing PM2.5 levels, timestamps, and sensor details.

# AEROS PM2.5 Dataset
## Overview
The **AEROS PM2.5 Dataset** provides a comprehensive collection of hourly PM2.5 measurements recorded over a period of five years from sensors located across Japan. This dataset is a valuable resource for studying air quality trends, pollution patterns, and environmental health impacts.
---
## Dataset Description
### File Information
* **File Name:** `FINAL_DATASET.csv`
* **Content:** Hourly PM2.5 measurements collected from sensors located in Japan over five years.
### Structure
The dataset includes the following columns:
1. **Timestamps**: The date and time when the measurement was recorded.
2. **Sensor Location IDs**: Unique identifiers for the sensor locations.
3. **PM2.5 Values (µg/m³)**: The recorded PM2.5 concentration at a specific timestamp and location.
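The three columns above could be read with Python's standard `csv` module along the following lines. This is a minimal sketch under stated assumptions: the header names (`timestamp`, `sensor_id`, `pm25`), the one-reading-per-row layout, and the timestamp format are all hypothetical, since the listing does not give the exact layout of `FINAL_DATASET.csv`. The sample rows are made up.

```python
import csv
import io
from datetime import datetime

# Made-up sample; header names and timestamp format are assumptions.
SAMPLE = """timestamp,sensor_id,pm25
2018-01-01 01:00:00,1001,12.0
2018-01-01 01:00:00,1002,
2018-01-01 02:00:00,1001,15.5
"""


def read_measurements(handle):
    """Yield (timestamp, sensor_id, pm25) tuples; pm25 is None for empty cells."""
    for row in csv.DictReader(handle):
        raw = row["pm25"].strip()
        yield (
            datetime.strptime(row["timestamp"], "%Y-%m-%d %H:%M:%S"),
            row["sensor_id"],
            float(raw) if raw else None,
        )


records = list(read_measurements(io.StringIO(SAMPLE)))
```

To try this on the real file, pass `open("FINAL_DATASET.csv", newline="")` instead of the in-memory sample, after adjusting the column names and timestamp format to match the file's actual header.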
### Units
* **PM2.5 Values:** Measured in micrograms per cubic meter (µg/m³).
---
## Notes on Data
* **Empty Cells**: Represent instances where no PM2.5 data was recorded by the s...
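
One way to respect empty cells in analysis is to treat them as missing rather than as zero, so that they do not pull averages down. The sketch below assumes records have already been parsed into `(timestamp, sensor_id, pm25)` tuples with `None` standing for an empty cell. That tuple shape and the `summarise` helper are hypothetical, not part of the dataset's tooling.

```python
from collections import defaultdict


def summarise(records):
    """Per sensor: count of missing readings and mean of the recorded ones."""
    totals = defaultdict(lambda: {"n": 0, "missing": 0, "sum": 0.0})
    for _timestamp, sensor_id, pm25 in records:
        stats = totals[sensor_id]
        if pm25 is None:
            stats["missing"] += 1      # empty cell: nothing was recorded
        else:
            stats["n"] += 1
            stats["sum"] += pm25
    return {
        sensor_id: {
            "missing": s["missing"],
            # Mean over recorded values only; None if the sensor has none.
            "mean": s["sum"] / s["n"] if s["n"] else None,
        }
        for sensor_id, s in totals.items()
    }


# Made-up readings for two hypothetical sensors.
demo = [
    ("2018-01-01 01:00:00", "A", 10.0),
    ("2018-01-01 02:00:00", "A", None),
    ("2018-01-01 03:00:00", "A", 20.0),
    ("2018-01-01 01:00:00", "B", None),
]
summary = summarise(demo)
```

Reporting the missing count alongside the mean makes it easier to spot sensors whose averages rest on very few readings.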
Created:
2024-12-12



