2na-97/FAKER-Air

Source: Hugging Face. Updated 2026-03-26; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/2na-97/FAKER-Air
Resource description:
---
license: mit
task_categories:
- time-series-forecasting
tags:
- climate
- code
pretty_name: FAKER-Air
size_categories:
- 10B<n<100B
---

# FAKER-Air Dataset

This repository contains the dataset used in **FAKER-Air**, consisting of ground-truth air quality observations interpolated onto a grid and CMAQ reanalysis data tailored for East Asia.

- **Paper**: [Real-Time Long Horizon Air Quality Forecasting via Group-Relative Policy Optimization](https://www.arxiv.org/abs/2511.22169)
- **Code**: [GitHub Repository](https://github.com/kaist-cvml/FAKER-Air)

## Dataset Structure

The data is organized into two main directories inside `data/`:

### 1. Observations (`data/obs`)

Ground-truth station data interpolated onto the CMAQ 27km grid.

- **Format**: `.npz` (Compressed NumPy archives)
- **Naming**: `YYYYMMDDHH_obs.npz` (e.g., `2016010100_obs.npz`)
- **Content**: Contains arrays for pollutant concentrations (PM2.5, PM10, etc.) on the grid.
- **Total Files**: ~74,000 files (Hourly data from 2016 to 2023+).

### 2. CMAQ Reanalysis (`data/cmaq`)

Physics-based model outputs (Community Multiscale Air Quality).

- **Format**: `.npy` and `.json`
- **Structure**: `YYYY/MM/DD/NIER_27_01/`
- **Files**:
  - `*_x_conc.npy`: Concentration fields.
  - `*_x_metcro2d.npy`: 2D Meteorological fields.
  - `*_x_metcro3d.npy`: 3D Meteorological fields.
  - `*_meta.json`: Metadata.

## How to Use

You can download specific parts of the dataset using the `huggingface_hub` Python library.

### Prerequisites

```bash
pip install huggingface_hub numpy
```

### Download & Load Example

```python
from huggingface_hub import snapshot_download
import numpy as np
import os

# 1. Download the dataset (It will cache data locally)
# To download only specific years or folders, use `allow_patterns`.
local_dir = snapshot_download(
    repo_id="2na-97/FAKER-Air",
    repo_type="dataset",
    allow_patterns=[
        "data/obs/2023*.npz",  # Example: Only download OBS for 2023
        "data/cmaq/2023/**"    # Example: Only download CMAQ for 2023
    ]
)
print(f"Data downloaded to: {local_dir}")

# 2. Load an OBS file
obs_path = os.path.join(local_dir, "data/obs/2023010100_obs.npz")
if os.path.exists(obs_path):
    data = np.load(obs_path)
    print("Keys in OBS:", data.files)
    # Example access: data['pm25']

# 3. Load a CMAQ file
cmaq_path = os.path.join(local_dir, "data/cmaq/2023/01/01/NIER_27_01/20230101_x_conc.npy")
if os.path.exists(cmaq_path):
    cmaq_data = np.load(cmaq_path)
    print("CMAQ Shape:", cmaq_data.shape)
```

## Citation

```bibtex
@misc{kang2026realtimelonghorizonair,
  title={Real-Time Long Horizon Air Quality Forecasting via Group-Relative Policy Optimization},
  author={Inha Kang and Eunki Kim and Wonjeong Ryu and Jaeyo Shin and Seungjun Yu and Yoon-Hee Kang and Seongeun Jeong and Eunhye Kim and Soontae Kim and Hyunjung Shim},
  year={2026},
  eprint={2511.22169},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2511.22169},
}
```
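As a companion to the naming conventions described under Dataset Structure, the following is a minimal sketch of how relative file paths could be built from a timestamp. The helper names `obs_relpath` and `cmaq_relpath` are hypothetical and not part of the dataset or its code repository. The `YYYYMMDD` prefix on CMAQ files is an assumption inferred from the single example path (`20230101_x_conc.npy`) in the loading example; the card itself only gives the `*_x_conc.npy` pattern.

```python
from datetime import datetime

# Hypothetical helpers (not from the FAKER-Air repository) that turn a
# timestamp into the relative paths described in the dataset card.

CMAQ_KINDS = {"conc", "metcro2d", "metcro3d"}


def obs_relpath(ts: datetime) -> str:
    """Hourly OBS file, following the documented `YYYYMMDDHH_obs.npz` naming."""
    return f"data/obs/{ts:%Y%m%d%H}_obs.npz"


def cmaq_relpath(ts: datetime, kind: str = "conc") -> str:
    """CMAQ array under the documented `YYYY/MM/DD/NIER_27_01/` structure.

    Assumption: the file prefix is YYYYMMDD, as in the card's one example.
    """
    if kind not in CMAQ_KINDS:
        raise ValueError(f"kind must be one of {sorted(CMAQ_KINDS)}")
    return f"data/cmaq/{ts:%Y/%m/%d}/NIER_27_01/{ts:%Y%m%d}_x_{kind}.npy"


if __name__ == "__main__":
    ts = datetime(2023, 1, 1, 0)
    print(obs_relpath(ts))
    print(cmaq_relpath(ts, "metcro2d"))
```

Joining these relative paths onto the `local_dir` returned by `snapshot_download` gives the locations to pass to `np.load`. Checking `os.path.exists` first, as the loading example does, remains advisable because not every timestamp is guaranteed to have a file.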
Provider:
2na-97