
dazhiyang/bsrn-merra2

Source: Hugging Face. Updated 2026-04-17; indexed 2026-03-29.
Download link:
https://hf-mirror.com/datasets/dazhiyang/bsrn-merra2
Resource description:
---
language: en
license: mit
configs:
- config_name: abs
  data_files:
  - split: abs
    path: abs/*.parquet
- config_name: ale
  data_files:
  - split: ale
    path: ale/*.parquet
- config_name: asp
  data_files:
  - split: asp
    path: asp/*.parquet
- config_name: bar
  data_files:
  - split: bar
    path: bar/*.parquet
- config_name: ber
  data_files:
  - split: ber
    path: ber/*.parquet
- config_name: bil
  data_files:
  - split: bil
    path: bil/*.parquet
- config_name: bon
  data_files:
  - split: bon
    path: bon/*.parquet
- config_name: bos
  data_files:
  - split: bos
    path: bos/*.parquet
- config_name: bou
  data_files:
  - split: bou
    path: bou/*.parquet
- config_name: brb
  data_files:
  - split: brb
    path: brb/*.parquet
- config_name: bud
  data_files:
  - split: bud
    path: bud/*.parquet
- config_name: cab
  data_files:
  - split: cab
    path: cab/*.parquet
- config_name: cam
  data_files:
  - split: cam
    path: cam/*.parquet
- config_name: cap
  data_files:
  - split: cap
    path: cap/*.parquet
- config_name: car
  data_files:
  - split: car
    path: car/*.parquet
- config_name: clh
  data_files:
  - split: clh
    path: clh/*.parquet
- config_name: cnr
  data_files:
  - split: cnr
    path: cnr/*.parquet
- config_name: coc
  data_files:
  - split: coc
    path: coc/*.parquet
- config_name: daa
  data_files:
  - split: daa
    path: daa/*.parquet
- config_name: dar
  data_files:
  - split: dar
    path: dar/*.parquet
- config_name: dom
  data_files:
  - split: dom
    path: dom/*.parquet
- config_name: dra
  data_files:
  - split: dra
    path: dra/*.parquet
- config_name: dwn
  data_files:
  - split: dwn
    path: dwn/*.parquet
- config_name: e13
  data_files:
  - split: e13
    path: e13/*.parquet
- config_name: ena
  data_files:
  - split: ena
    path: ena/*.parquet
- config_name: eur
  data_files:
  - split: eur
    path: eur/*.parquet
- config_name: flo
  data_files:
  - split: flo
    path: flo/*.parquet
- config_name: fpe
  data_files:
  - split: fpe
    path: fpe/*.parquet
- config_name: fua
  data_files:
  - split: fua
    path: fua/*.parquet
- config_name: gan
  data_files:
  - split: gan
    path: gan/*.parquet
- config_name: gcr
  data_files:
  - split: gcr
    path: gcr/*.parquet
- config_name: gim
  data_files:
  - split: gim
    path: gim/*.parquet
- config_name: gob
  data_files:
  - split: gob
    path: gob/*.parquet
- config_name: gur
  data_files:
  - split: gur
    path: gur/*.parquet
- config_name: gvn
  data_files:
  - split: gvn
    path: gvn/*.parquet
- config_name: how
  data_files:
  - split: how
    path: how/*.parquet
- config_name: ilo
  data_files:
  - split: ilo
    path: ilo/*.parquet
- config_name: ino
  data_files:
  - split: ino
    path: ino/*.parquet
- config_name: ish
  data_files:
  - split: ish
    path: ish/*.parquet
- config_name: iza
  data_files:
  - split: iza
    path: iza/*.parquet
- config_name: kwa
  data_files:
  - split: kwa
    path: kwa/*.parquet
- config_name: lau
  data_files:
  - split: lau
    path: lau/*.parquet
- config_name: ler
  data_files:
  - split: ler
    path: ler/*.parquet
- config_name: lin
  data_files:
  - split: lin
    path: lin/*.parquet
- config_name: lmp
  data_files:
  - split: lmp
    path: lmp/*.parquet
- config_name: lrc
  data_files:
  - split: lrc
    path: lrc/*.parquet
- config_name: lyu
  data_files:
  - split: lyu
    path: lyu/*.parquet
- config_name: man
  data_files:
  - split: man
    path: man/*.parquet
- config_name: mnm
  data_files:
  - split: mnm
    path: mnm/*.parquet
- config_name: nau
  data_files:
  - split: nau
    path: nau/*.parquet
- config_name: new
  data_files:
  - split: new
    path: new/*.parquet
- config_name: nya
  data_files:
  - split: nya
    path: nya/*.parquet
- config_name: ohy
  data_files:
  - split: ohy
    path: ohy/*.parquet
- config_name: pal
  data_files:
  - split: pal
    path: pal/*.parquet
- config_name: par
  data_files:
  - split: par
    path: par/*.parquet
- config_name: pay
  data_files:
  - split: pay
    path: pay/*.parquet
- config_name: psu
  data_files:
  - split: psu
    path: psu/*.parquet
- config_name: ptr
  data_files:
  - split: ptr
    path: ptr/*.parquet
- config_name: qiq
  data_files:
  - split: qiq
    path: qiq/*.parquet
- config_name: reg
  data_files:
  - split: reg
    path: reg/*.parquet
- config_name: rlm
  data_files:
  - split: rlm
    path: rlm/*.parquet
- config_name: run
  data_files:
  - split: run
    path: run/*.parquet
- config_name: sap
  data_files:
  - split: sap
    path: sap/*.parquet
- config_name: sbo
  data_files:
  - split: sbo
    path: sbo/*.parquet
- config_name: sel
  data_files:
  - split: sel
    path: sel/*.parquet
- config_name: sms
  data_files:
  - split: sms
    path: sms/*.parquet
- config_name: son
  data_files:
  - split: son
    path: son/*.parquet
- config_name: sov
  data_files:
  - split: sov
    path: sov/*.parquet
- config_name: spo
  data_files:
  - split: spo
    path: spo/*.parquet
- config_name: sxf
  data_files:
  - split: sxf
    path: sxf/*.parquet
- config_name: syo
  data_files:
  - split: syo
    path: syo/*.parquet
- config_name: tam
  data_files:
  - split: tam
    path: tam/*.parquet
- config_name: tat
  data_files:
  - split: tat
    path: tat/*.parquet
- config_name: tik
  data_files:
  - split: tik
    path: tik/*.parquet
- config_name: tir
  data_files:
  - split: tir
    path: tir/*.parquet
- config_name: tor
  data_files:
  - split: tor
    path: tor/*.parquet
- config_name: xia
  data_files:
  - split: xia
    path: xia/*.parquet
- config_name: yus
  data_files:
  - split: yus
    path: yus/*.parquet
tags:
- solar
- radiation
- bsrn
pretty_name: BSRN MERRA-2 Atmospheric Inputs
---

<!-- This file is uploaded as the Hugging Face dataset README; keep content user-facing (no internal maintainer or upload-only notes). -->

# BSRN MERRA-2 Atmospheric Inputs

Point-extracted MERRA-2 reanalysis data for [Baseline Surface Radiation Network (BSRN)](https://bsrn.awi.de/) stations. These parquet files provide atmospheric and aerosol inputs for the **REST2** clear-sky radiation model [2].

## Dataset Description

Each file contains hourly MERRA-2 variables [1] at a single BSRN station location. **Extraction is performed via Google Earth Engine (GEE)** from NASA's 0.5° × 0.625° global grid. Data are aligned to the MERRA-2 grid cell nearest the station coordinates.

### File Structure

Station folders use the **lowercase** BSRN three-letter code (e.g. `ber`, `qiq`, or `spo`), with the exception of `e13`.
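Because each file name combines the lowercase station code with a two-digit month and a two-digit year (the `{station}/{station}{MM}{YY}_merra2.parquet` pattern used throughout this section), the repo-relative path can be built programmatically. The helper below is a minimal sketch: `merra2_filename` is a hypothetical name and is not part of the `bsrn` package.

```python
def merra2_filename(station_code: str, year: int, month: int) -> str:
    """Hypothetical helper: repo-relative path of one monthly parquet file."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be in 1..12, got {month}")
    # Folder and file prefix both use the lowercase station code.
    station = station_code.lower()
    # MM is the zero-padded month, YY the last two digits of the year.
    return f"{station}/{station}{month:02d}{year % 100:02d}_merra2.parquet"


print(merra2_filename("QIQ", 2024, 1))  # qiq/qiq0124_merra2.parquet
print(merra2_filename("ber", 2025, 3))  # ber/ber0325_merra2.parquet
```

The returned string can be passed as the `filename` argument of `hf_hub_download`.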
```
{station}/
  {station}{MM}{YY}_merra2.parquet   # One file per month
```

Examples:

- `qiq/qiq0124_merra2.parquet`: QIQ, January 2024
- `ber/ber0325_merra2.parquet`: BER, March 2025

### Variables

| Column | Description                           | MERRA-2 Source | Units (raw)   |
|--------|---------------------------------------|----------------|---------------|
| AOD55  | Aerosol optical depth at 550 nm       | TOTEXTTAU      | dimensionless |
| ALPHA  | Ångström exponent                     | TOTANGSTR      | dimensionless |
| ALBEDO | Surface albedo                        | ALBEDO         | [0–1]         |
| TQV    | Total column precipitable water vapor | TQV            | kg/m²         |
| TO3    | Total column ozone                    | TO3            | Dobson        |
| PS     | Surface pressure                      | PS             | Pa            |

- **Index**: UTC `DatetimeIndex` (hourly, MERRA-2 native resolution).
- **Time coverage**: MERRA-2 spans 1980–present; files are generated only for months with BSRN station-to-archive data on the FTP.

### Use with REST2

These parquet files are designed for the REST2 clear-sky model [2]. The `bsrn` Python package fetches MERRA-2 from this dataset **into RAM** (no disk cache) and provides:

- `fetch_rest2(index, station_code)`: fetches from HF into RAM, reindexes to the 1-min target, interpolates, derives BETA, and converts units for REST2
- Raw parquet: use `pandas.read_parquet` on a path from `huggingface_hub` (see below)

REST2 expects: **PS** (hPa), **ALBEDO**, **ALPHA**, **BETA** (derived from AOD55 and ALPHA), **TO3** (atm·cm), **TQV** (atm·cm).
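As a rough illustration of the steps listed for `fetch_rest2` (reindex to the 1-min target, interpolate, derive BETA, convert units), the following sketch does the same with plain pandas. It is not the `bsrn` implementation: `to_rest2_inputs` is a hypothetical name, linear-in-time interpolation is an assumption, and the 0.001 fallback from the conversion table in this section is read here as a floor on BETA when AOD55 is zero.

```python
import pandas as pd


def to_rest2_inputs(raw: pd.DataFrame, target_index: pd.DatetimeIndex) -> pd.DataFrame:
    """Hypothetical helper: hourly raw MERRA-2 columns -> REST2-ready inputs."""
    # Place the hourly rows on the union of both indexes, interpolate linearly
    # in time (an assumption), then keep only the target timestamps.
    union = raw.index.union(target_index)
    interp = raw.reindex(union).interpolate(method="time").reindex(target_index)

    out = pd.DataFrame(index=target_index)
    out["PS"] = interp["PS"] / 100.0      # Pa -> hPa
    out["ALBEDO"] = interp["ALBEDO"]      # already in [0, 1]
    out["ALPHA"] = interp["ALPHA"]        # dimensionless
    # Angstrom turbidity: BETA = AOD55 * 0.55**ALPHA (wavelength in micrometres).
    beta = interp["AOD55"] * 0.55 ** interp["ALPHA"]
    # Assumed reading of the table note: fall back to 0.001 where AOD55 is 0.
    out["BETA"] = beta.where(interp["AOD55"] != 0, 0.001)
    out["TO3"] = interp["TO3"] / 1000.0   # Dobson units -> atm·cm
    out["TQV"] = interp["TQV"] / 10.0     # kg/m² -> atm·cm
    return out
```

With `df` loaded through `pandas.read_parquet` and `target_index` set to a UTC 1-min `DatetimeIndex`, `to_rest2_inputs(df, target_index)` returns one row per target timestamp. Timestamps before the first hourly record stay `NaN` in this sketch.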
**Conversion tips** (raw → REST2):

| Variable | Raw unit      | REST2 unit    | Conversion                                |
|----------|---------------|---------------|-------------------------------------------|
| PS       | Pa            | hPa           | ÷ 100                                     |
| ALBEDO   | [0–1]         | [0–1]         | no conversion                             |
| ALPHA    | dimensionless | dimensionless | no conversion                             |
| BETA     | n/a           | n/a           | AOD55 × 0.55^ALPHA (use 0.001 if AOD55=0) |
| TO3      | Dobson        | atm·cm        | ÷ 1000                                    |
| TQV      | kg/m²         | atm·cm        | ÷ 10                                      |

## Usage

### Load from Hugging Face

```python
from huggingface_hub import hf_hub_download
import pandas as pd

# Download a single file; path is {station}/{station}{MM}{YY}_merra2.parquet
path = hf_hub_download(
    repo_id="dazhiyang/bsrn-merra2",
    filename="qiq/qiq0124_merra2.parquet",
    repo_type="dataset",
)
df = pd.read_parquet(path)
# df has DatetimeIndex (UTC) and columns: AOD55, ALPHA, ALBEDO, TQV, TO3, PS
```

### Use with bsrn package

The `bsrn` package fetches MERRA-2 from Hugging Face **into RAM** (no disk cache). You will see `Fetching MERRA-2 from Hugging Face: {filename}` when it runs.

```python
from bsrn.modeling.clear_sky import add_clearsky_columns
# NOTE: load_merra2_parquet and fetch_rest2 must also be imported from the
# bsrn package; their module path is not given here.

# Option 1: Load raw parquet (from path, e.g. after hf_hub_download)
df = load_merra2_parquet(path)

# Option 2: REST2-ready inputs (fetches from HF into RAM, interpolated to 1-min, units converted)
# target_index = your BSRN 1-min DatetimeIndex
rest2_inputs = fetch_rest2(target_index, station_code="QIQ")

# Option 3: Add clear-sky columns to BSRN data (fetches MERRA-2 from HF into RAM automatically)
df = add_clearsky_columns(df, station_code="QIQ", model="rest2")
```

## Data Sources

- **MERRA-2**: [NASA GMAO](https://gmao.gsfc.nasa.gov/gmao-products/merra-2/), [GES DISC](https://disc.gsfc.nasa.gov/)
- **Extraction**: Google Earth Engine (GEE). This dataset uses GEE; NCSS is an alternative source.
- **Station inventory**: BSRN FTP (months with `.dat.gz` files only)

> **GEE extraction and validation**
>
> The data in this dataset is **extracted via Google Earth Engine (GEE)**.
> GEE's MERRA-2 pixel boundaries are offset by half a cell in latitude relative to raw NetCDF. To correct this, a **−0.25° latitude shift** is applied when querying GEE so that the returned pixel aligns with the MERRA-2 grid cell used by raw MERRA-2 and NASA GESDISC NCSS.
>
> **The dataset creator has confirmed the correctness** of the GEE extraction by comparing it against raw MERRA-2 data from NASA and point extractions from NASA GESDISC THREDDS NCSS.

## References

1. Gelaro, R., McCarty, W., Suárez, M. J., Todling, R., Molod, A., Takacs, L., ... & Zhao, B. (2017). The modern-era retrospective analysis for research and applications, version 2 (MERRA-2). *Journal of Climate*, 30(14), 5419–5454.
2. Gueymard, C. A. (2008). REST2: High-performance solar radiation model for cloudless-sky irradiance, illuminance, and photosynthetically active radiation – Validation with a benchmark dataset. *Solar Energy*, 82(3), 272–285.
3. Sun, X., Bright, J. M., Gueymard, C. A., Acord, B., Wang, P., & Engerer, N. A. (2019). Worldwide performance assessment of 75 global clear-sky irradiance models using principal component analysis. *Renewable and Sustainable Energy Reviews*, 111, 550–570.

## Citation

If you use this dataset, please cite the references above and the **bsrn** package: [dazhiyang/bsrn](https://github.com/dazhiyang/bsrn).

## License

This dataset mirrors publicly available MERRA-2 reanalysis data. MERRA-2 is produced by NASA and is freely available. See [NASA's data use policy](https://www.earthdata.nasa.gov/what-is-nasa-earth-data) for terms of use.
Provider:
dazhiyang