dazhiyang/bsrn-merra2
Hugging Face · Updated 2026-04-17 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/dazhiyang/bsrn-merra2
Resource description:
---
language: en
license: mit
configs:
- config_name: abs
data_files:
- split: abs
path: abs/*.parquet
- config_name: ale
data_files:
- split: ale
path: ale/*.parquet
- config_name: asp
data_files:
- split: asp
path: asp/*.parquet
- config_name: bar
data_files:
- split: bar
path: bar/*.parquet
- config_name: ber
data_files:
- split: ber
path: ber/*.parquet
- config_name: bil
data_files:
- split: bil
path: bil/*.parquet
- config_name: bon
data_files:
- split: bon
path: bon/*.parquet
- config_name: bos
data_files:
- split: bos
path: bos/*.parquet
- config_name: bou
data_files:
- split: bou
path: bou/*.parquet
- config_name: brb
data_files:
- split: brb
path: brb/*.parquet
- config_name: bud
data_files:
- split: bud
path: bud/*.parquet
- config_name: cab
data_files:
- split: cab
path: cab/*.parquet
- config_name: cam
data_files:
- split: cam
path: cam/*.parquet
- config_name: cap
data_files:
- split: cap
path: cap/*.parquet
- config_name: car
data_files:
- split: car
path: car/*.parquet
- config_name: clh
data_files:
- split: clh
path: clh/*.parquet
- config_name: cnr
data_files:
- split: cnr
path: cnr/*.parquet
- config_name: coc
data_files:
- split: coc
path: coc/*.parquet
- config_name: daa
data_files:
- split: daa
path: daa/*.parquet
- config_name: dar
data_files:
- split: dar
path: dar/*.parquet
- config_name: dom
data_files:
- split: dom
path: dom/*.parquet
- config_name: dra
data_files:
- split: dra
path: dra/*.parquet
- config_name: dwn
data_files:
- split: dwn
path: dwn/*.parquet
- config_name: e13
data_files:
- split: e13
path: e13/*.parquet
- config_name: ena
data_files:
- split: ena
path: ena/*.parquet
- config_name: eur
data_files:
- split: eur
path: eur/*.parquet
- config_name: flo
data_files:
- split: flo
path: flo/*.parquet
- config_name: fpe
data_files:
- split: fpe
path: fpe/*.parquet
- config_name: fua
data_files:
- split: fua
path: fua/*.parquet
- config_name: gan
data_files:
- split: gan
path: gan/*.parquet
- config_name: gcr
data_files:
- split: gcr
path: gcr/*.parquet
- config_name: gim
data_files:
- split: gim
path: gim/*.parquet
- config_name: gob
data_files:
- split: gob
path: gob/*.parquet
- config_name: gur
data_files:
- split: gur
path: gur/*.parquet
- config_name: gvn
data_files:
- split: gvn
path: gvn/*.parquet
- config_name: how
data_files:
- split: how
path: how/*.parquet
- config_name: ilo
data_files:
- split: ilo
path: ilo/*.parquet
- config_name: ino
data_files:
- split: ino
path: ino/*.parquet
- config_name: ish
data_files:
- split: ish
path: ish/*.parquet
- config_name: iza
data_files:
- split: iza
path: iza/*.parquet
- config_name: kwa
data_files:
- split: kwa
path: kwa/*.parquet
- config_name: lau
data_files:
- split: lau
path: lau/*.parquet
- config_name: ler
data_files:
- split: ler
path: ler/*.parquet
- config_name: lin
data_files:
- split: lin
path: lin/*.parquet
- config_name: lmp
data_files:
- split: lmp
path: lmp/*.parquet
- config_name: lrc
data_files:
- split: lrc
path: lrc/*.parquet
- config_name: lyu
data_files:
- split: lyu
path: lyu/*.parquet
- config_name: man
data_files:
- split: man
path: man/*.parquet
- config_name: mnm
data_files:
- split: mnm
path: mnm/*.parquet
- config_name: nau
data_files:
- split: nau
path: nau/*.parquet
- config_name: new
data_files:
- split: new
path: new/*.parquet
- config_name: nya
data_files:
- split: nya
path: nya/*.parquet
- config_name: ohy
data_files:
- split: ohy
path: ohy/*.parquet
- config_name: pal
data_files:
- split: pal
path: pal/*.parquet
- config_name: par
data_files:
- split: par
path: par/*.parquet
- config_name: pay
data_files:
- split: pay
path: pay/*.parquet
- config_name: psu
data_files:
- split: psu
path: psu/*.parquet
- config_name: ptr
data_files:
- split: ptr
path: ptr/*.parquet
- config_name: qiq
data_files:
- split: qiq
path: qiq/*.parquet
- config_name: reg
data_files:
- split: reg
path: reg/*.parquet
- config_name: rlm
data_files:
- split: rlm
path: rlm/*.parquet
- config_name: run
data_files:
- split: run
path: run/*.parquet
- config_name: sap
data_files:
- split: sap
path: sap/*.parquet
- config_name: sbo
data_files:
- split: sbo
path: sbo/*.parquet
- config_name: sel
data_files:
- split: sel
path: sel/*.parquet
- config_name: sms
data_files:
- split: sms
path: sms/*.parquet
- config_name: son
data_files:
- split: son
path: son/*.parquet
- config_name: sov
data_files:
- split: sov
path: sov/*.parquet
- config_name: spo
data_files:
- split: spo
path: spo/*.parquet
- config_name: sxf
data_files:
- split: sxf
path: sxf/*.parquet
- config_name: syo
data_files:
- split: syo
path: syo/*.parquet
- config_name: tam
data_files:
- split: tam
path: tam/*.parquet
- config_name: tat
data_files:
- split: tat
path: tat/*.parquet
- config_name: tik
data_files:
- split: tik
path: tik/*.parquet
- config_name: tir
data_files:
- split: tir
path: tir/*.parquet
- config_name: tor
data_files:
- split: tor
path: tor/*.parquet
- config_name: xia
data_files:
- split: xia
path: xia/*.parquet
- config_name: yus
data_files:
- split: yus
path: yus/*.parquet
tags:
- solar
- radiation
- bsrn
pretty_name: BSRN MERRA-2 Atmospheric Inputs
---
<!-- This file is uploaded as the Hugging Face dataset README; keep content user-facing (no internal maintainer or upload-only notes). -->
# BSRN MERRA-2 Atmospheric Inputs
Point-extracted MERRA-2 reanalysis data for [Baseline Surface Radiation Network (BSRN)](https://bsrn.awi.de/) stations. These parquet files provide atmospheric and aerosol inputs for the **REST2** clear-sky radiation model [2].
## Dataset Description
Each file contains hourly MERRA-2 variables [1] at a single BSRN station location. **Extraction is performed via Google Earth Engine (GEE)** from NASA's 0.5° × 0.625° global grid. Data are aligned to the MERRA-2 grid cell nearest the station coordinates.
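As a rough illustration of what "nearest grid cell" means here, the sketch below snaps a coordinate to the closest MERRA-2 cell center. It assumes the standard MERRA-2 grid (latitudes from −90° in 0.5° steps, longitudes from −180° in 0.625° steps). The function name `nearest_merra2_cell` and the sample coordinates are hypothetical, and this is not the extraction code used to build the dataset.

```python
import math

LAT_STEP = 0.5      # degrees, MERRA-2 latitude spacing
LON_STEP = 0.625    # degrees, MERRA-2 longitude spacing
N_LAT = 361         # cell centers from -90 to 90
N_LON = 576         # cell centers from -180 to 179.375


def nearest_merra2_cell(lat: float, lon: float) -> tuple[float, float]:
    """Return the (lat, lon) of the MERRA-2 cell center nearest to a point."""
    # Round half up to the nearest index along each axis.
    i = math.floor((lat + 90.0) / LAT_STEP + 0.5)
    j = math.floor((lon + 180.0) / LON_STEP + 0.5)
    i = min(max(i, 0), N_LAT - 1)   # clamp at the poles
    j %= N_LON                      # wrap around the date line
    return -90.0 + i * LAT_STEP, -180.0 + j * LON_STEP


# Hypothetical coordinates, for illustration only.
print(nearest_merra2_cell(47.8, 124.49))  # (48.0, 124.375)
```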
### File Structure
Station folders use the **lowercase** BSRN three-character station code (e.g. `ber`, `qiq`, or `spo`). Most codes are three letters; `e13` is the exception because it contains digits.
```
{station}/
    {station}{MM}{YY}_merra2.parquet   # One file per month
```
Examples:
- `qiq/qiq0124_merra2.parquet` — QIQ, January 2024
- `ber/ber0325_merra2.parquet` — BER, March 2025
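The naming pattern above can be turned into a small helper. This is a minimal sketch; the name `merra2_filename` is hypothetical and not part of any package mentioned here.

```python
def merra2_filename(station: str, year: int, month: int) -> str:
    """Build the repo-relative path {station}/{station}{MM}{YY}_merra2.parquet."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    code = station.lower()  # station folders use the lowercase code
    return f"{code}/{code}{month:02d}{year % 100:02d}_merra2.parquet"


print(merra2_filename("QIQ", 2024, 1))   # qiq/qiq0124_merra2.parquet
print(merra2_filename("ber", 2025, 3))   # ber/ber0325_merra2.parquet
```

The returned string can be passed as the `filename` argument of `hf_hub_download`.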
### Variables
| Column | Description | MERRA-2 Source | Units (raw) |
|--------|--------------------------------------------------|----------------|---------------|
| AOD55 | Aerosol optical depth at 550 nm | TOTEXTTAU | dimensionless |
| ALPHA | Ångström exponent | TOTANGSTR | dimensionless |
| ALBEDO | Surface albedo | ALBEDO | [0–1] |
| TQV | Total column precipitable water vapor | TQV | kg/m² |
| TO3 | Total column ozone | TO3 | Dobson |
| PS | Surface pressure | PS | Pa |
- **Index**: UTC `DatetimeIndex` (hourly, MERRA-2 native resolution).
- **Time coverage**: MERRA-2 spans 1980 to present; files are generated only for months in which the station has station-to-archive data on the BSRN FTP server.
### Use with REST2
These parquet files are designed for the REST2 clear-sky model [2]. The `bsrn` Python package fetches MERRA-2 from this dataset **into RAM** (no disk cache) and provides:
- `fetch_rest2(index, station_code)` — fetch from HF into RAM, reindex to 1-min target, interpolate, derive BETA, and convert units for REST2
- Raw parquet: use `pandas.read_parquet` on a path from `huggingface_hub` (see below)
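The reindex-and-interpolate step mentioned in the first bullet can be sketched in plain pandas. This is only an illustration under stated assumptions: linear interpolation in time is assumed, the helper name `hourly_to_target` is hypothetical, and the `bsrn` package may implement this step differently.

```python
import pandas as pd


def hourly_to_target(hourly: pd.DataFrame, target_index: pd.DatetimeIndex) -> pd.DataFrame:
    """Interpolate hourly values onto a finer (e.g. 1-min) DatetimeIndex."""
    # Merge both time axes so the hourly anchor points stay in the frame.
    union = hourly.index.union(target_index)
    filled = hourly.reindex(union).interpolate(method="time", limit_area="inside")
    # Keep only the requested timestamps; points outside the hourly range stay NaN.
    return filled.reindex(target_index)


# Tiny made-up example: two hourly surface-pressure values in Pa.
hourly = pd.DataFrame(
    {"PS": [100000.0, 100600.0]},
    index=pd.to_datetime(["2024-01-01 00:00", "2024-01-01 01:00"], utc=True),
)
target = pd.date_range("2024-01-01 00:00", periods=61, freq="min", tz="UTC")
one_min = hourly_to_target(hourly, target)
```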
REST2 expects: **PS** (hPa), **ALBEDO**, **ALPHA**, **BETA** (derived from AOD55 and ALPHA), **TO3** (atm·cm), **TQV** (cm of precipitable water).
**Conversion tips** (raw → REST2):
| Variable | Raw unit | REST2 unit | Conversion |
|----------|---------------|---------------|-------------------------------------------|
| PS | Pa | hPa | ÷ 100 |
| ALBEDO | [0–1] | [0–1] | no conversion |
| ALPHA | dimensionless | dimensionless | no conversion |
| BETA | — | — | AOD55 × 0.55^ALPHA (use 0.001 if AOD55=0) |
| TO3 | Dobson | atm·cm | ÷ 1000 |
| TQV | kg/m² | cm | ÷ 10 |
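The conversions in the table can be written as a short pandas function. This is a sketch rather than the `bsrn` package's implementation: the name `to_rest2_units` is hypothetical, and the note "use 0.001 if AOD55=0" is interpreted here as setting BETA to 0.001 wherever AOD55 is exactly zero.

```python
import pandas as pd


def to_rest2_units(raw: pd.DataFrame) -> pd.DataFrame:
    """Apply the raw -> REST2 conversions from the table to a raw MERRA-2 frame."""
    out = pd.DataFrame(index=raw.index)
    out["PS"] = raw["PS"] / 100.0          # Pa -> hPa
    out["ALBEDO"] = raw["ALBEDO"]          # no conversion
    out["ALPHA"] = raw["ALPHA"]            # no conversion
    # Angstrom turbidity: BETA = AOD55 * 0.55 ** ALPHA
    beta = raw["AOD55"] * 0.55 ** raw["ALPHA"]
    # Assumed reading of the table note: fall back to 0.001 where AOD55 == 0.
    out["BETA"] = beta.where(raw["AOD55"] != 0, 0.001)
    out["TO3"] = raw["TO3"] / 1000.0       # Dobson units divided by 1000
    out["TQV"] = raw["TQV"] / 10.0         # kg/m² divided by 10
    return out


# Made-up sample values, for illustration only.
raw = pd.DataFrame(
    {
        "AOD55": [0.1, 0.0],
        "ALPHA": [1.0, 1.3],
        "ALBEDO": [0.2, 0.2],
        "TQV": [20.0, 15.0],
        "TO3": [300.0, 280.0],
        "PS": [101325.0, 100000.0],
    }
)
rest2 = to_rest2_units(raw)
```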
## Usage
### Load from Hugging Face
```python
from huggingface_hub import hf_hub_download
import pandas as pd
# Download a single file; path is {station}/{station}{MM}{YY}_merra2.parquet
path = hf_hub_download(
repo_id="dazhiyang/bsrn-merra2",
filename="qiq/qiq0124_merra2.parquet",
repo_type="dataset",
)
df = pd.read_parquet(path)
# df has DatetimeIndex (UTC) and columns: AOD55, ALPHA, ALBEDO, TQV, TO3, PS
```
### Use with bsrn package
The `bsrn` package fetches MERRA-2 from Hugging Face **into RAM** (no disk cache). You will see `Fetching MERRA-2 from Hugging Face: {filename}` when it runs.
```python
from bsrn.modeling.clear_sky import add_clearsky_columns
# NOTE: load_merra2_parquet and fetch_rest2 are also used below; import them from
# the bsrn package as well (their module path is not given in this README).
# Option 1: Load raw parquet (from path, e.g. after hf_hub_download)
merra2_df = load_merra2_parquet(path)
# Option 2: REST2-ready inputs (fetches from HF into RAM, interpolated to 1-min, units converted)
# target_index = your BSRN 1-min DatetimeIndex
rest2_inputs = fetch_rest2(target_index, station_code="QIQ")
# Option 3: Add clear-sky columns to BSRN data (fetches MERRA-2 from HF into RAM automatically)
# bsrn_df = your BSRN 1-min DataFrame
bsrn_df = add_clearsky_columns(bsrn_df, station_code="QIQ", model="rest2")
```
## Data Sources
- **MERRA-2**: [NASA GMAO](https://gmao.gsfc.nasa.gov/gmao-products/merra-2/), [GES DISC](https://disc.gsfc.nasa.gov/)
- **Extraction**: Google Earth Engine (GEE). The NetCDF Subset Service (NCSS) on NASA GES DISC THREDDS is an alternative source.
- **Station inventory**: BSRN FTP (months with `.dat.gz` files only)
> **GEE extraction and validation**
>
> The data in this dataset are **extracted via Google Earth Engine (GEE)**. GEE's MERRA-2 pixel boundaries are offset by half a cell in latitude (0.25° on the 0.5° latitude grid) relative to the raw NetCDF files. To correct this, a **−0.25° latitude shift** is applied when querying GEE so that the returned pixel aligns with the grid cell used by raw MERRA-2 and NASA GES DISC NCSS.
>
> **The dataset creator has confirmed the correctness** of the GEE extraction by comparing it against raw MERRA-2 data from NASA and point extractions from NASA GESDISC THREDDS NCSS.
## References
1. Gelaro, R., McCarty, W., Suárez, M. J., Todling, R., Molod, A., Takacs, L., ... & Zhao, B. (2017). The modern-era retrospective analysis for research and applications, version 2 (MERRA-2). *Journal of Climate*, 30(14), 5419–5454.
2. Gueymard, C. A. (2008). REST2: High-performance solar radiation model for cloudless-sky irradiance, illuminance, and photosynthetically active radiation—Validation with a benchmark dataset. *Solar Energy*, 82(3), 272–285.
3. Sun, X., Bright, J. M., Gueymard, C. A., Acord, B., Wang, P., & Engerer, N. A. (2019). Worldwide performance assessment of 75 global clear-sky irradiance models using principal component analysis. *Renewable and Sustainable Energy Reviews*, 111, 550–570.
## Citation
If you use this dataset, please cite the references above and the **bsrn** package: [dazhiyang/bsrn](https://github.com/dazhiyang/bsrn).
## License
This dataset mirrors publicly available MERRA-2 reanalysis data. MERRA-2 is produced by NASA and is freely available. See [NASA's data use policy](https://www.earthdata.nasa.gov/what-is-nasa-earth-data) for terms of use.
Provider:
dazhiyang



