
rachitgoyell/vayu-raw

Source: Hugging Face. Updated 2026-03-23. Indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/rachitgoyell/vayu-raw
Description:
---
license: cc-by-4.0
language:
- en
tags:
- air-quality
- india
- cpcb
- pollution
- environment
- aqi
pretty_name: VAYU — Raw CPCB Air Quality Data (India)
size_categories:
- 1M<n<10M
---

# VAYU — Raw CPCB Air Quality Data

Raw sensor data collected from India's Central Pollution Control Board (CPCB) Continuous Ambient Air Quality Monitoring Stations (CAAQMS).

## Contents

| File | Rows | Description |
|---|---|---|
| `aqi_india_38cols_knn_final.csv` | 842,160 | Primary dataset: hourly pollutant readings across 29 cities, KNN-imputed |
| `*_AQIBulletins.csv` (277 files) | ~300,000 | Daily AQI bulletins, one file per city, 277 cities total |

## Key Facts

- **Cities:** 29 cities in primary file, 277 cities across bulletin files
- **Time range:** 2015 – 2024 (hourly)
- **Pollutants:** PM2.5, PM10, NO2, SO2, CO, O3
- **Known issue:** Sentinel value `999` is used by CPCB to indicate sensor error; it is not a real reading and must be cleaned before use
- **Total raw files scanned:** 299 (CSV + XLSX)

## How to Use

This dataset is the input to the VAYU data cleaning pipeline. Run `vayu_step1_setup.ipynb` → `vayu_step2_cleaning.ipynb` to produce the cleaned version, or load the pre-cleaned version directly from [vayu-cleaned](https://huggingface.co/datasets/rachitgoyell/vayu-cleaned).

## Related Repository

Cleaned and model-ready version: [rachitgoyell/vayu-cleaned](https://huggingface.co/datasets/rachitgoyell/vayu-cleaned)
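The sentinel cleanup noted under Key Facts can be illustrated with a small, standard-library sketch. It is not the code from `vayu_step2_cleaning.ipynb`, whose contents are not shown here. The column names, the `clean_rows` helper, and the sample rows are all assumptions made for illustration; the real CSV headers may differ.

```python
import csv
import io

# Assumed column names, taken from the pollutant list in the card.
# The actual headers in the CSV files may be different.
POLLUTANT_COLUMNS = ["PM2.5", "PM10", "NO2", "SO2", "CO", "O3"]
SENTINEL = 999.0  # CPCB sensor-error marker described in the card


def clean_rows(reader, columns=POLLUTANT_COLUMNS, sentinel=SENTINEL):
    """Yield rows with pollutant cells parsed to float.

    Cells equal to the sentinel, blank cells, and non-numeric cells
    become None so they cannot be mistaken for real readings.
    """
    for row in reader:
        cleaned = dict(row)
        for col in columns:
            raw = (row.get(col) or "").strip()
            try:
                value = float(raw)
            except ValueError:
                value = None  # blank or non-numeric cell
            if value is not None and value == sentinel:
                value = None  # sensor error, not a measurement
            cleaned[col] = value
        yield cleaned


# Made-up sample rows standing in for a real file from the dataset.
SAMPLE = """city,timestamp,PM2.5,PM10,NO2,SO2,CO,O3
ExampleCity,2020-01-01 00:00,182.4,999,41.0,9.8,1.2,14.5
ExampleCity,2020-01-01 01:00,999,310.0,39.5,,1.1,999
"""

rows = list(clean_rows(csv.DictReader(io.StringIO(SAMPLE))))
missing = sum(
    value is None
    for row in rows
    for col, value in row.items()
    if col in POLLUTANT_COLUMNS
)
print(missing)  # 4 for the sample above
```

To try this on a downloaded file, replace `io.StringIO(SAMPLE)` with an open file handle and adjust `POLLUTANT_COLUMNS` to match the actual headers. Mapping the sentinel to `None` rather than dropping the row keeps the hourly timeline intact, which may matter for any later imputation step.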
Provider:
rachitgoyell